From 7001c05ec06cccf929b3e296d20d5e601d03fda8 Mon Sep 17 00:00:00 2001 From: Jerome Petazzoni Date: Sat, 18 Jun 2016 18:06:15 -0700 Subject: [PATCH] DockerCon update --- elk/logstash.conf | 34 + www/htdocs/index.html | 3992 ++++++++++------------------------------- 2 files changed, 1018 insertions(+), 3008 deletions(-) create mode 100644 elk/logstash.conf diff --git a/elk/logstash.conf b/elk/logstash.conf new file mode 100644 index 000000000..1e11e9a69 --- /dev/null +++ b/elk/logstash.conf @@ -0,0 +1,34 @@ +input { + # Listens on 514/udp and 514/tcp by default; change that to non-privileged port + syslog { port => 51415 } + # Default port is 12201/udp + gelf { } + # This generates one test event per minute. + # It is great for debugging, but you might + # want to remove it in production. + heartbeat { } +} +# The following filter is a hack! +# The "de_dot" filter would be better, but it +# is not pre-installed with logstash by default. +filter { + ruby { + code => " + event.to_hash.keys.each { |k| event[ k.gsub('.','_') ] = event.remove(k) if k.include?'.' } + " + } +} +output { + elasticsearch { + hosts => ["elasticsearch:9200"] + } + # This will output every message on stdout. + # It is great when testing your setup, but in + # production, it will probably cause problems; + # either by filling up your disks, or worse, + # by creating logging loops! BEWARE! + stdout { + codec => rubydebug + } +} + diff --git a/www/htdocs/index.html b/www/htdocs/index.html index 5a5db298a..e66a46afa 100644 --- a/www/htdocs/index.html +++ b/www/htdocs/index.html @@ -101,7 +101,11 @@ ## Logistics + + +- Hi Captains my Captains! I'm `jerome at docker dot com` -- The tutorial will run from 1:20pm to 4:40pm - -- There will be a break from 3:00pm to 3:15pm +- The tutorial will run from 1pm to 5pm -- This will be FAST PACED, but DON'T PANIC! +- There will be a break at 2:45pm (stop me if I don't!) - All the content is publicly available (slides, code samples, scripts) - Live feedback, questions, help on - [Gitter](http://container.training/chat) + [Slack](http://container.training/chat) --- @@ -162,7 +164,7 @@ --- -## Chapter 2: Swarm setup and deployment +## Chapter 2: Setting up our cluster - Dynamic orchestration - Deploying Swarm @@ -207,19 +209,33 @@ [Git BASH](https://git-for-windows.github.io/), or [MobaXterm](http://mobaxterm.mobatek.net/) -- Basic Docker knowledge -
(but that's OK if you're not a Docker expert!) +- Some Docker knowledge + + (If you're here, you definitely qualify ☺) + + + --- ## Nice-to-haves - [GitHub](https://github.com/join) account +
(if you want to fork the repo) + + + +- Slack account to connect to the Docker community Slack + - [Docker Hub](https://hub.docker.com) account
(it's one way to distribute images on your Swarm cluster) @@ -229,7 +245,7 @@ - The whole workshop is hands-on -- I will show Docker in action +- I will show Docker 1.12 in action - I invite you to reproduce what I do @@ -240,7 +256,7 @@ - This is the stuff you're supposed to do! - Go to [container.training](http://container.training/) to view these slides - Join the chat room on - [Gitter](http://container.training/chat) + [Slack](http://container.training/chat) ] @@ -300,7 +316,7 @@ - When we will use the other nodes, we will do it mostly through the Docker API -- We will use SSH only for a few "out of band" operations (mass-removing containers...) +- We will use SSH only for the initial setup and a few "out of band" operations (checking internal logs, debugging...) --- @@ -336,7 +352,7 @@ ## Brand new versions! -- Engine 1.11 +- Engine 1.12-rc-something - Compose 1.7 - Swarm 1.2 - Machine 0.6 @@ -355,18 +371,6 @@ --- -## Why are we not using the latest version of Machine? - -- The latest version of Machine is 0.7 - -- The way it deploys Swarm is different from 0.6 - -- This causes a regression in the strategy that we will use later - -- More details later! - ---- - # Our sample application - Visit the GitHub repository with all the materials of this workshop: @@ -631,8 +635,6 @@ - For that, we will use good old UNIX tools on our Docker node - - --- ## Looking at resource usage @@ -767,3954 +769,1928 @@ --- -# Connecting to containers on other hosts - -- So far, our whole stack is on a single machine +# SwarmKit -- We want to scale out (across multiple nodes) +- [SwarmKit](https://github.com/docker/swarmkit) is an open source + toolkit to build multi-node systems -- We will deploy the same stack multiple times +- It is a reusable library, like libcontainer, libnetwork, vpnkit ... -- But we want every stack to use the same Redis -
(in other words: Redis is our only *stateful* service here) +- It is a plumbing part of the Docker ecosystem --- +- SwarmKit comes with two examples: -- And remember: we're not allowed to change the code! + - `swarmctl` (a CLI tool to "speak" the SwarmKit API) - - the code connects to host `redis` - - `redis` must resolve to the address of our Redis service - - the Redis service must listen on the default port (6379) + - `swarmd` (an agent that can federate existing Docker Engines into a Swarm) -??? +- SwarmKit/swarmd/swarmctl → libcontainer/containerd/container-ctr -## Using custom DNS mapping +--- -- We could setup a Redis server on its default port +## SwarmKit features -- And add a DNS entry mapping `redis` to this server +- Highly-available, distributed store based on Raft -.exercise[ +- *Services* managed with a *declarative API* +
(implementing *desired state* and *reconciliation loop*) -- See what happens if we run: - ```bash - docker run --add-host redis:1.2.3.4 alpine ping redis - ``` +- Automatic TLS keying and signing - +- Dynamic promotion/demotion of nodes, allowing to change + how many nodes are actively part of the Raft consensus -] +- Integration with overlay networks and load balancing -There is a Compose file option for that: `extra_hosts`. +- And much more! --- -# Abstracting remote services with ambassadors - - - -- We will use an ambassador +- A *cluster* will be at least one *node* (preferably more) -- Redis will be started independently of our stack +- A *node* can be a *manager* or a *worker* -- It will run at an arbitrary location (host+port) + (Note: in SwarmKit, *managers* are also *workers*) -- In our stack, we replace `redis` with an ambassador +- A *manager* actively takes part in the Raft consensus -- The ambassador will connect to Redis +- You can talk to a *manager* using the SwarmKit API -- The ambassador will "act as" Redis in the stack +- One *manager* is elected as the *leader*; other managers merely forward requests to it --- -class: pic +## SwarmKit concepts (2/2) -![Ambassador principle](static-orchestration-1-node-a.png) +- The *managers* expose the SwarmKit API ---- +- Using the API, you can indicate that you want to run a *service* -class: pic +- A *service* is specified by its *desired state*: which image, how many instances... -![Ambassador principle](static-orchestration-1-node-b.png) +- The *leader* uses different subsystems to break down services into *tasks*: +
orchestrator, scheduler, allocator, dispatcher ---- +- A *task* corresponds to a specific container, assigned to a specific *node* -class: pic +- *Nodes* know which *tasks* should be running, and will start or stop containers accordingly (through the Docker Engine API) -![Ambassador principle](static-orchestration-1-node-c.png) +You can refer to the [NOMENCLATURE](https://github.com/docker/swarmkit/blob/master/NOMENCLATURE.md) in the SwarmKit repo for more details. --- -class: pic +## Swarm Mode -![Ambassador principle](static-orchestration-2-nodes.png) +- Docker Engine 1.12 features SwarmKit integration ---- +- The Docker CLI features three new commands: -class: pic + - `docker swarm` (enable Swarm mode; join a Swarm; adjust cluster parameters) -![Ambassador principle](static-orchestration-3-nodes.png) + - `docker node` (view nodes; promote/demote managers; manage nmodes) ---- + - `docker service` (create and manage services) -class: pic +- The Docker API exposes the same concepts -![Ambassador principle](static-orchestration-4-nodes.png) +- The SwarmKit API is also exposed (on a separate socket) --- -class: pic -![Ambassador principle](static-orchestration-5-nodes.png) +## Illustration --- -## Start redis +## You need to enable Swarm mode to use the new stuff -- Start a standalone Redis container +- By default, everything runs as usual -- Let Docker expose it on a random port +- Swarm Mode can be enabled, "unlocking" SwarmKit functions +
(services, out-of-the-box overlay networks, etc.) .exercise[ -- Run redis with a random public port: -
`docker run -d -P --name myredis redis` - -- Check which port was allocated: -
`docker port myredis 6379` +- Try a Swarm-specific command: + ``` + $ docker node ls + Error response from daemon: this node is not participating as a Swarm manager + ``` ] -- Note the IP address of the machine, and this port - --- -## Introduction to `jpetazzo/hamba` +# Creating our first Swarm + +- The cluster is initialized with `docker swarm init` -- General purpose load balancer and traffic director +- This should be executed on a first, seed node -- [Source code is available on GitHub]( - https://github.com/jpetazzo/hamba) +- .warning[DO NOT execute `docker swarm init` on multiple nodes!] -- [Public image is available on the Docker Hub]( - https://hub.docker.com/r/jpetazzo/hamba/) + You would have multiple disjoint cluster. -- Generates a configuration file for HAProxy, then starts HAProxy +.exercise[ -- Parameters are provided on the command line; for instance: +- Create our cluster from node1: ```bash - docker run -d -p 80 jpetazzo/hamba 80 www1:1234 www2:2345 - docker run -d -p 80 jpetazzo/hamba 80 www1 1234 www2 2345 + docker swarm init ``` - Those two commands do the same thing: they start a load balancer - listening on port 80, and balancing traffic across www1:1234 and www2:2345 + +] --- -## Update `docker-compose.yml` +## Checking that Swarm mode is enabled .exercise[ -- Replace `redis` with an ambassador using `jpetazzo/hamba`: - ```yaml - redis: - image: jpetazzo/hamba - command: 6379 `AA.BB.CC.DD:EEEEE` +- Run the traditional `docker info` command: + ```bash + docker info ``` - - ] -Shortcut: `docker-compose.yml-ambassador` -
(But you still have to update `AA.BB.CC.DD:EEEEE`!) +The output should include: ---- +``` +Swarm: active + NodeID: d1kf12wtm4gabh9fjzbukbu50 + IsManager: Yes + Managers: 1 + Nodes: 1 + CACertHash: sha256:330cf7e8e50a0af5d0990c1e078c709... +``` -## Start the stack on the first machine +--- -- Compose will detect the change in the `redis` service +## Running our first Swarm mode command -- It will replace `redis` with a `jpetazzo/hamba` instance +- Let's retry the exact same command as earlier .exercise[ -- Just tell Compose to do its thing: -
`docker-compose up -d` - -- Check that the stack is up and running: -
`docker-compose ps` - -- Look at the web UI to make sure that it works fine +- List the nodes (well, the only node) of our cluster: + ```bash + docker node ls + ``` ] ---- - -## Controlling other Docker Engines - -- Many tools in the ecosystem will honor the `DOCKER_HOST` environment variable - -- Those tools include (obviously!) the Docker CLI and Docker Compose +The output should look like the following: +``` +ID NAME MEMBERSHIP STATUS AVAILABILITY MANAGER STATUS +d1kf...12wt * ip-172-31-25-65 Accepted Ready Active Leader +``` -- Our training VMs have been setup to accept API requests on port 55555 -
(without authentication - this is very insecure, by the way!) +--- -- We will see later how to setup mutual authentication with certificates +## Adding nodes to the Swarm ---- +- A cluster with one node is not a lot of fun -## Setting the `DOCKER_HOST` environment variable +- Let's add `node2`! .exercise[ -- Check how many containers are running on `node1`: +- Log into `node2` and join the cluster: ```bash - docker ps + ssh node2 docker swarm join node1:2377 ``` -- Set the `DOCKER_HOST` variable to control `node2`, and compare: +- Check that the node is here indeed: ```bash - export DOCKER_HOST=tcp://node2:55555 - docker ps + docker node ls ``` ] -You shouldn't see any container running on `node2` at this point. - --- -## Start the stack on another machine +## Adding nodes using the Docker API -- We will tell Compose to bring up our stack on the other node +- We don't have to SSH into the other nodes, we can use the Docker API -- It will use the local code (we don't need to checkout the code on `node2`) +- Our nodes (for this workshop) expose the Docker API over port 55555, + without authentication (DO NOT DO THIS IN PRODUCTION; FOR EDUCATIONAL USE ONLY) .exercise[ -- Start the stack: +- Set `DOCKER_HOST` and add `node3` to the Swarm: ```bash - docker-compose up -d + DOCKER_HOST=tcp://node3:55555 docker swarm join node1:2377 ``` -] +- Check that the node is here: + ```bash + docker node ls + ``` -Note: this will build the container images on `node2`, resulting -in potentially different results from `node1`. We will see later -how to use the same images across the whole cluster. +] --- -## Run the application on every node +## Controlling who can join the cluster -- We will repeat the previous step with a little shell loop +- By default, any node can join the cluster - ... but introduce parallelism to save some time +- Let's change that and require a password for new nodes .exercise[ -- Deploy one instance of the stack on each node: - +- Update the cluster configuration: ```bash - for N in 3 4 5; do - DOCKER_HOST=tcp://node$N:55555 docker-compose up -d & - done - wait + docker swarm update --secret I_love_ponies ``` ] -Note: again, this will rebuild the container images on each node. - --- -## Scale! +## Checking the cluster configuration -- The app is built (and running!) everywhere - -- Scaling can be done very quickly +- We can see the cluster parameters with `docker swarm inspect` .exercise[ -- Add a bunch of workers all over the place: - +- Check that the secret is now in place: ```bash - for N in 1 2 3 4 5; do - DOCKER_HOST=tcp://node$N:55555 docker-compose scale worker=10 - done + docker swarm inspect ``` -- Admire the result in the web UI! - ] ---- - -## A few words about development volumes +A hashed `"Secret"` field should show up twice in the `"Policies"` section. -- Try to access the web UI on another node +--- --- +## Joining a password-protected cluster -- It doesn't work! Why? +- Let's try to add a node to this cluster --- + (Without providing a password) + +.exercise[ -- Static assets are masked by an empty volume +- Try to add node4 to the cluster: + ```bash + ssh node4 docker swarm join node1:2377 + ``` --- +] -- We need to comment out the `volumes` section +The node will be denied right away. --- -## Why must we comment out the `volumes` section? - -- Volumes have multiple uses: - - - storing persistent stuff (database files...) - - - sharing files between containers (logs, configuration...) 
+## Specifying the password when joining - - sharing files between host and containers (source...) +- The `docker swarm join` command also takes the `--secret` flag -- The `volumes` directive expands to an host path: +.exercise[ - `/home/docker/orchestration-workshop/dockercoins/webui/files` +- Try again, with the right secret: + ```bash + ssh node4 docker swarm join node1:2377 --secret I_love_ponies + ``` -- This host path exists on the local machine (not on the others) +- Check that the node is now in the cluster: + ```bash + docker node ls + ``` -- This specific volume is used in development (not in production) +] --- -## Stop the app +## Enabling manual vetting of nodes -- Let's use `docker-compose down` +- You can also decide to approve each node before they can join the Swarm -- It will stop and remove the DockerCoins app (but leave other containers running) +- This can be enabled independently of secrets .exercise[ -- We can do another simple parallel shell loop: +- Disable auto-accept mode, and remove the secret password: ```bash - for N in $(seq 1 5); do - export DOCKER_HOST=tcp://node$N:55555 - docker-compose down & - done - wait + docker swarm update --auto-accept none --secret "" ``` ] ---- +Note: to disable the password, we specified an empty string. -## Clean up the redis container +--- -- `docker-compose down` only removes containers defined with Compose +## Manually accepting nodes into the cluster .exercise[ -- Check that `myredis` is still there: - ```bash - unset DOCKER_HOST - docker ps - ``` - -- Remove it: +- Try to get `node5` to join: ```bash - docker rm -f myredis + ssh node5 docker swarm join node1:2377 ``` ] ---- - -## Considerations about ambassadors - -"Ambassador" is a design pattern. +You will see a message telling that the node is *pending*. -There are many ways to implement it. - -Others implementations include: +--- -- [interlock](https://github.com/ehazlett/interlock); -- [registrator](http://gliderlabs.com/registrator/latest/); -- [smartstack](http://nerds.airbnb.com/smartstack-service-discovery-cloud/); -- [zuul](https://github.com/Netflix/zuul/wiki); -- and more! +## Seeing pending nodes and accepting them in the cluster - +- See nodes (including `node5` which is currently pending): + ```bash + docker node ls + ``` -??? +- Accept `node5` in the cluster: + ``` + docker node accept XXX + ``` -## Single-tier ambassador deployment +] -- One-shot configuration process +Note: you don't have to type the node ID in full; the first characters +are enough. -- Must be executed manually after each scaling operation +--- -- Scans current state, updates load balancer configuration +## Under the hood -- Pros: -
- simple, robust, no extra moving part -
- easy to customize (thanks to simple design) -
- can deal efficiently with large changes +When we do `docker swarm init`, a TLS root CA is created. Then a keypair is issued for the first node, and signed by the root CA. -- Cons: -
- must be executed after each scaling operation -
- harder to compose different strategies +When further nodes join the Swarm, they are issued their own keypair, signed by the root CA, and they also receive the root CA public key and certificate. -- Example: this workshop +All communication is encrypted over TLS. -??? +The node keys and certificates are automatically renewed on regular intervals (by default, 90 days; this is tunable with `docker swarm update`). -## Two-tier ambassador deployment +As we could see, nodes can join automatically or be approved manually; in both cases, this can be done with or without a pre-shared secret; and this policy can be changed during the lifecycle of the cluster without restarting or breaking anything. -- Daemon listens to Docker events API +--- -- Reacts to container start/stop events +## Promoting nodes to be managers -- Adds/removes back-ends to load balancers configuration +- Right now, we have only one manager (node1) -- Pros: -
- no extra step required when scaling up/down +- If we lose it, we're SOL -- Cons: -
- extra process to run and maintain -
- deals with one event at a time (ordering matters) +- Let's make our cluster highly available -- Hidden gotcha: load balancer creation +.exercise[ -- Example: interlock +- See the current list of nodes: + ``` + docker node ls + ``` -??? +- Promote a couple of nodes to be managers: + ``` + docker promote XXX YYY + ``` -## Three-tier ambassador deployment +] +--- -- Daemon listens to Docker events API +## You can control the Swarm from any manager node -- Reacts to container start/stop events +.exercise[ -- Adds/removes scaled services in distributed config DB (Zookeeper, etcd, Consul…) +- Try the following command on a few different nodes: + ```bash + ssh nodeX docker node ls + ``` -- Another daemon listens to config DB events, -
adds/removes backends to load balancers configuration +] -- Pros: -
- more flexibility +On manager nodes: +
you will see the list of nodes, with a `*` denoting +the node you're talking to. -- Cons: -
- three extra services to run and maintain +On non-manager nodes: +
you will get an error message telling you that +the node is not a manager. -- Example: registrator +You can only control the Swarm through a manager node. --- -## Ambassadors and overlay networks - -- Overlay networks allow direct multi-host communication +# Running our first Swarm service -- Ambassadors are still useful to implement other tasks: +- How do we run services? Simplified version: - - load balancing; + `docker run` → `docker service create` - - credentials injection; - - - instrumentation; - - - fail-over; - - - etc. +.exercise[ ---- +- Create a service featuring an Alpine container pinging Google resolvers: + ```bash + docker service create alpine ping 8.8.8.8 + ``` -class: title +- Check where the container was created: + ```bash + docker service tasks + ``` -# Dynamic orchestration +] --- -## Static vs Dynamic - -- Static +## Checking container logs - - you decide what goes where +- Right now, there is no direct way to check the logs of our container +
(unless it was scheduled on the current node) - - simple to describe and implement +- Look up the `NODE` on which the container is running + (in the output of the `docker service tasks` command) - - seems easy at first but doesn't scale efficiently - -- Dynamic - - - the system decides what goes where +.exercise[ - - requires extra components (HA KV...) +- Log into the node: + ```bash + ssh ip-172-31-XXX-XXX + ``` - - scaling can be finer-grained, more efficient +] --- -class: pic +## Viewing the logs of the container -## Hands-on Swarm +- We need to be logged into the node running the container -![Swarm Logo](swarm.png) +.exercise[ ---- +- See that the container is running and check its ID: + ```bash + docker ps + ``` -## Swarm (in theory) +- View its logs: + ```bash + docker logs + ``` -- Consolidates multiple Docker hosts into a single one +] -- You talk to Swarm using the Docker API +--- - → you can use all existing tools: Docker CLI, Docker Compose, etc. +## Scale our service -- Swarm talks to your Docker Engines using the Docker API too +- Services can be scaled in a pinch with the `docker service update` + command - → you can use existing Engines without modification +.exercise[ -- Dispatches (schedules) your containers across the cluster, transparently +- Scale the service to ensure 2 copies per node: + ```bash + docker service update --replicas 10 + ``` -- Open source and written in Go (like the Docker Engine) +- Check that we have two containers on the current node: + ```bash + docker ps + ``` -- Initial design and implementation by [@aluzzardi](https://twitter.com/aluzzardi) and [@vieux](https://twitter.com/vieux), - who were also the authors of the first versions of the Docker Engine +] --- -## Swarm (in practice) - -- Stable since November 2015 +## Expose a service -- Easy to setup (compared to other orchestrators) +- Services can be exposed, with two special properties: -- Tested with 1000 nodes + 50000 containers -
.small[(without particular tuning; see DockerCon EU opening keynotes!)] + - the public port is available on *every node of the Swarm*, -- Requires a key/value store for advanced features + - requests coming on the public port are load balanced across all instances. -- Can use Consul, etcd, or Zookeeper +- This is achieved with option `-p/--publish`; as an approximation: ---- - -# Deploying Swarm + `docker run -p → docker service create -p` -- Components involved: +- If you indicate a single port number, it will be mapped on a port + start at 30000 +
(vs. 32768 for single container mapping) - - cluster discovery mechanism -
(so that the manager can learn about the nodes) +- You can indicate two port numbers to set the public port number - - Swarm manager -
(your frontend to the cluster) - - - Swarm agent -
(runs on each node, registers it with service discovery) + (Just like with `docker run -p`) --- -## Cluster discovery +## Expose ElasticSearch on its default port -- Possible backends: +.exercise[ - - dynamic, self-hosted -
(requires to run a Consul/etcd/Zookeeper cluster) +- Create an ElasticSearch service (and give it a name while we're at it): + ```bash + docker service create --name search --publish 9200:9200 --replicas 7 \ + elasticsearch + ``` - - static, through command-line or file -
(great for testing, or for private subnets, see [this article]( - https://medium.com/on-docker/docker-swarm-flat-file-engine-discovery-2b23516c71d4#.6vp94h5wn)) +- Check what's going on: + ```bash + watch docker service tasks search + ``` - - external, token-based -
(dynamic; nothing to operate; relies on external service operated by Docker Inc.) +] --- -## Swarm agent - -- Used only for dynamic discovery (ZK, etcd, Consul, token) +## Tasks lifecycle -- Must run on each node +- If you are fast enough, you will be able to see multiple states: -- Every 20s (by default), tells to the discovery system: - - *"Hello, there is a Swarm node at A.B.C.D:EFGH"* + - accepted (the task has been assigned to a specific node) + - preparing (right now, this mostly means "pulling the image") + - running -- Must know the node's IP address - - (It cannot figure it out by itself, because it doesn't know whether to use public or private addresses) +- When a task is terminated (stopped, killed...) it cannot be restarted -- The node continues to work even if the agent dies + (A replacement task will be created) --- -## Swarm manager +## Test our service -- Accepts Docker API requests - -- Communicates with the cluster nodes - -- Performs healthchecks, scheduling... - ---- +- We mapped port 9200 on the nodes, to port 9200 in the containers -# Picking a key/value store +- Let's try to reach that port! -- We are going to use a key/value store, and use it for: +.exercise[ - - cluster membership discovery +- Repeat the following command a few times: + ```bash + curl localhost:9200 + ``` - - overlay networks backend - - - resilient storage of important credentials - - - Swarm leader election +] -- We are going to use Consul, and run one Consul instance on each node +Each request should be served by a different ElasticSearch instance. - (That way, we can always access Consul over localhost) +(You will see each instance advertising a different name.) --- -## Do we really need a key/value store? +## Terminate our services -- Cluster membership discovery doesn't *require* a key/value store +- Before moving on, we will remove those services - (We could use the token mechanism instead) +- `docker service rm` can accept multiple services names or IDs -- Network overlays don't *require* a key/value store +- `docker service ls` can accept the `-q` flag - (We could use a plugin like Weave instead) +- A Shell snippet a day keeps the cruft away -- Credentials can be distributed through other mechanisms +.exercise[ - (E.g. copying them to a private S3 bucket) +- Remove all services with this one liner: + ```bash + docker service ls -q | xargs docker service rm + ``` -- Swarm leader election, however, requires a key/value store +] --- -## Why are we using a key/value store, then? +class: title -- Each aforementioned mechanism requires some reliable, distributed storage +# Our app on Swarm -- If we don't use our own key/value store, we end up using *something else*: +--- - - Docker Inc.'s centralized token discovery service +## What's on the menu? - - [Weave's CRDT protocol](https://github.com/weaveworks/weave/wiki/IP-allocation-design) +In this part, we will cover: - - AWS S3 (or your cloud provider's equivalent, or some other file storage system) +- building images for our app, -- Each of those is one extra potential point of failure +- shipping those images with a registry, -- See for instance [Kyle Kingsbury's analysis of Chronos](https://aphyr.com/posts/326-jepsen-chronos) for an illustration of this problem +- running them through the services concept, -- By operating our own key/value store, we have 1 extra service instead of 3 (or more) +- enabling inter-container communication with overlay networks. --- -## Should we always use a key/value store? +## Why do we need to ship our images? 
--- - -- No! - --- +- When we do `docker-compose up`, images are built for our services -- If you don't want to operate your own key/value store, don't do it +- Those images are present only on the local node -- You might be more comfortable using tokens + Weave + S3, for instance +- We need those images to be distributed on the whole Swarm -- You can also use static discovery +- The easiest way to achieve that is to use a Docker registry -- Maybe you don't even need overlay networks +- Once our images are on a registry, we can reference them when + creating our services --- -## Why Consul? +## Build, ship, and run, for a single service -- Consul is not the "official" or best way to do this +If we had only one service (built from a `Dockerfile` in the +current directory), our workflow could look like this: -- This is an arbitrary decision made by Truly Yours - -- I *personally* find Consul easier to setup for a workshop like this +``` +docker build -t jpetazzo/doublerainbow:v0.1 . +docker push jpetazzo/doublerainbow:v0.1 +docker service create jpetazzo/doublerainbow:v0.1 +``` -- ... But etcd and Zookeper will work too! +We just have to adapt this to our application, which has 4 services! --- -## Setting up our Swarm cluster - -We need to: - -- create certificates, +## The plan -- distribute them on our nodes, +- Build on our local node (`node1`) -- run the Swarm agent on every node, +- Tag images with a version number -- run the Swarm manager on `node1`, + (timestamp; git hash; semantic...) -- reconfigure the Engine on each node to add extra flags (for overlay networks). +- Upload them to a registry -That's a lot of work, so we'll use Docker Machine to automate this. +- Update the Compose file to use those images --- -## Using Docker Machine to setup a Swarm cluster +## Which registry do we want to use? + +.small[ -- Docker Machine has two primary uses: +- **Docker Hub** - - provisioning cloud instances running the Docker Engine + - hosted by Docker Inc. + - requires an account (free, no credit card needed) + - images will be public (unless you pay) + - located in AWS EC2 us-east-1 - - managing local Docker VMs within e.g. VirtualBox +- **Docker Trusted Registry** -- It can also create Swarm clusters, and will: + - self-hosted commercial product + - requires a subscription (free 30-day trial available) + - images can be public or private + - located wherever you want - - create and manage certificates +- **Docker open source registry** - - automatically start swarm agent and manager containers + - self-hosted barebones repository hosting + - doesn't require anything + - doesn't come with anything either + - located wherever you want -- It comes with a special driver, `generic`, to (re)configure existing machines +] --- -## Setting up Docker Machine +## Using Docker Hub -- Install `docker-machine` (single binary download) +- Set the `DOCKER_REGISTRY` environment variable to your Docker Hub user name +
(the `build-tag-push.py` script prefixes each image name with that variable) - (This is already done on your VMs!) +- We will also see how to run the open source registry +
(so use whatever option you want!) -- Set a few environment variables (cloud credentials) - ```bash - export AWS_ACCESS_KEY_ID=AKI... - export AWS_SECRET_ACCESS_KEY=... - export AWS_DEFAULT_REGION=eu-west-2 - export DIGITALOCEAN_ACCESS_TOKEN=... - export DIGITALOCEAN_SIZE=2gb - export AZURE_SUBSCRIPTION_ID=... - ``` - - (We already have 5 nodes, so we don't need to do this!) +.exercise[ ---- + -## Creating nodes with Docker Machine +- Set the following environment variable: +
`export DOCKER_REGISTRY=jpetazzo` -- The only two mandatory parameters are the driver to use, and the machine name: - ```bash - docker-machine create -d digitalocean node42 - ``` +- (Use *your* Docker Hub login, of course!) -- *Tons* of parameters can be specified; see [Docker Machine driver documentation](https://docs.docker.com/machine/drivers/) +- Log into the Docker Hub: +
`docker login` -- To list machines and their status: - ```bash - docker-machine ls - ``` + -- To destroy a machine: - ```bash - docker-machine rm node42 - ``` +] --- -## Communicating with nodes managed by Docker Machine - -- Select a machine for use: - ```bash - eval $(docker-machine env node42) - ``` - This will set a few environment variables (at least `DOCKER_HOST`). +## Using Docker Trusted Registry -- Execute regular commands with Docker, Compose, etc. +If we wanted to use DTR, we would: - (They will pick up remote host address from environment) +- make sure we have a Docker Hub account +- [activate a Docker Datacenter subscription]( + https://hub.docker.com/enterprise/trial/) +- install DTR on our machines +- set `DOCKER_REGISTRY` to `dtraddress:port/user` -- If you need to go under the hood, you can get SSH access: - ```bash - docker-machine ssh node42 - ``` +*This is out of the scope of this workshop!* --- -## Docker Machine `generic` driver - -- Most drivers work the same way: - - - use cloud API to create instance - - - connect to instance over SSH +## Using open source registry - - install Docker +- We need to run a `registry:2` container +
(make sure you specify tag `:2` to run the new version!) -- The `generic` driver skips the first step +- It will store images and layers to the local filesystem +
(but you can add a config file to use S3, Swift, etc.) -- It can install Docker on any machine, as long as you have SSH access +- Docker *requires* TLS when communicating with the registry + + - unless for registries on `localhost` + + - or with the Engine flag `--insecure-registry` -- We will use that! +- Our strategy: publish the registry container on port 5000, +
and connect to it through `localhost:5000` on each node --- -## Setting up Swarm with Docker Machine - -When invoking Machine, we will provide three sets of parameters: - -- the machine driver to use (`generic`) and the SSH connection information - -- Swarm-specific options indicating the cluster membership discovery mechanism - -- Extra flags to be passed to the Engine, to enable overlay networks - ---- +# Deploying a local registry -## Provisioning the first node +- We will create a single-instance service, publishing its port + on the whole cluster .exercise[ -- Use the following command to provision the manager node: - - +- Try the following command, until it returns `{"repositories":[]}`: ```bash - docker-machine create --driver generic \ - --engine-opt cluster-store=consul://localhost:8500 \ - --engine-opt cluster-advertise=eth0:2376 \ - --swarm --swarm-master --swarm-discovery consul://localhost:8500 \ - --generic-ssh-user docker --generic-ip-address `AA.BB.CC.DD` node1 + curl localhost:5000/v2/_catalog ``` ] ---- +(Retry a few times, it might take 10-20 seconds for the container to be started. Patience.) -## Provisioning the other nodes +--- -- The command is almost the same, but without the `--swarm-master` flag +## Testing our local registry -- We will use a shell snippet for convenience +- We can retag a small image, and push it to the registry .exercise[ -```bash - grep node[2345] /etc/hosts | grep -v ^127 | - while read IPADDR NODENAME - do docker-machine create --driver generic \ - --engine-opt cluster-store=consul://localhost:8500 \ - --engine-opt cluster-advertise=eth0:2376 \ - --swarm --swarm-discovery consul://localhost:8500 \ - --generic-ssh-user docker \ - --generic-ip-address $IPADDR $NODENAME - done -``` +- Make sure we have the busybox image, and retag it: + ```bash + docker pull busybox + docker tag busybox localhost:5000/busybox + ``` + +- Push it: + ```bash + docker push localhost:5000/busybox + ``` ] --- -## Check what we did +## Checking what's on our local registry -Let's connect to the first node *individually*. +- The registry API has endpoints to query what's there .exercise[ -- Select the node with Machine - - ```bash - eval $(docker-machine env node1) - ``` - -- Execute some Docker commands - +- Ensure that our busybox image is now in the local registry: ```bash - docker version - docker info + curl http://localhost:5000/v2/_catalog ``` ] -In the output of `docker info`, we should see `Cluster store` and `Cluster advertise`. +The curl command should now output: +```json +{"repositories":["busybox"]} +``` --- -## Interact with the node +## Build, tag, and push our application container images -Let's try a few basic Docker commands on this node. +- Scriptery to the rescue! .exercise[ -- Run a simple container: - ```bash - docker run --rm busybox echo hello world - ``` +- Set `DOCKER_REGISTRY` and `TAG` environment variablesto use our local registry -- See running containers: +- And run this little for loop: ```bash - docker ps + DOCKER_REGISTRY=localhost:5000 + TAG=v0.1 + for SERVICE in hasher rng webui worker; do + docker-compose build $SERVICE + docker tag dockercoins_$SERVICE $DOCKER_REGISTRY/dockercoins_$SERVICE:$TAG + docker push $DOCKER_REGISTRY/dockercoins_$SERVICE + done ``` ] -Two containers should show up: the agent and the manager. - --- -## Connect to the Swarm cluster +# Overlay networks -Now, let's try the same operations, but when talking to the Swarm manager. 
+- SwarmKit integrates with overlay networks, without requiring + an extra key/value store -.exercise[ +- Overlay networks are created the same way as before -- Select the Swarm manager with Machine: +.exercise[ +- Create an overlay network for our application: ```bash - eval $(docker-machine env node1 --swarm) + docker network create --driver overlay dockercoins ``` -- Execute some Docker commands - +- Check existing networks: ```bash - docker version - docker info - docker ps + docker network ls ``` ] -The output is different! Let's review this. - --- -## `docker version` - -Swarm identifies itself clearly: +## Can you spot the difference? -``` -Client: - Version: 1.11.1 - API version: 1.23 - Go version: go1.5.4 - Git commit: 5604cbe - Built: Tue Apr 26 23:38:55 2016 - OS/Arch: linux/amd64 - -Server: - Version: swarm/1.2.2 - API version: 1.22 - Go version: go1.5.4 - Git commit: 34e3da3 - Built: Mon May 9 17:03:22 UTC 2016 - OS/Arch: linux/amd64 -``` +The `dockercoins` network is different from the other ones. ---- +Can you see how? -## `docker info` +-- -The output of `docker info` on Swarm shows a number of differences from -the output on a single Engine: +It is using a different kind of ID, reflecting that it's a SwarmKit object +instead of a "classic" Docker Engine object. -.small[ -``` -Containers: 0 - Running: 0 - Paused: 0 - Stopped: 0 -Images: 0 -Server Version: swarm/1.2.2 -Role: primary -Strategy: spread -Filters: health, port, containerslots, dependency, affinity, constraint -Nodes: 0 -Plugins: - Volume: - Network: -Kernel Version: 4.2.0-36-generic -Operating System: linux -Architecture: amd64 -CPUs: 0 -Total Memory: 0 B -Name: node1 -Docker Root Dir: -Debug mode (client): false -Debug mode (server): false -WARNING: No kernel memory limit support -``` -] --- -## Why zero nodes? +## Caveats -- We haven't started Consul yet - -- Swarm discovery is not operational - -- Swarm can't discover the nodes - -Note: Docker will start (and be functional) without a K/V store. - -This lets us run Consul itself in a container. +.warning[As I type those lines (i.e. before boarding the plane to Seattle), +it is not possible yet to join an overlay network with `docker run --net ...`; +this might or might not be enabled in the future. We will see how to cope +with this limitation.] --- -## Adding Consul +## Run the application -- We will run Consul in containers +- First, start the redis service; that one is using a Docker Hub image -- We will use the [Consul official image]( - https://hub.docker.com/_/consul/) that was released *very recently* +.exercise[ -- We will tell Docker to automatically restart it on reboots +- Create the redis service: + ```bash + docker service create --network dockercoins --name redis redis + ``` -- To simplify network setup, we will use `host` networking +] --- -## A few words about `host` networking +## Run the other services -- Consul needs to be aware of its actual IP address (seen by other nodes) +- Then, start the other services one by one -- It also binds a bunch of different ports +- We will use the images pushed previously -- It makes sense (from a security point of view) to have Consul listening on localhost only +.exercise[ - (and have "users", i.e. Engine, Swarm, etc. 
connect over localhost) +- Start the other services: + ```bash + DOCKER_REGISTRY=localhost:5000 + TAG=v0.1 + for SERVICE in hasher rng webui worker; do + docker service create --network dockercoins --name $SERVICE \ + $DOCKER_REGISTRY/dockercoins_$SERVICE:$TAG + done + ``` -- Therefore, we will use `host` networking! +] -- Also: Docker Machine 0.6 starts the Swarm containers in `host` networking ... +--- -- ... but Docker Machine 0.7 doesn't (which is why we stick to 0.6 for now) +## Wait for our application to be up ---- +- We will see later a way to watch progress for all the tasks of the cluster -## Consul fundamentals (if I must give you just one slide...) +- But for now, a scrappy Shell loop will do the trick -- Consul nodes can be "just an agent" or "server" +.exercise[ -- From the client's perspective, they behave the same +- Repeatedly display the status of all our services: + ```bash + watch "docker service ls -q | xargs -n1 docker service tasks" + ``` -- Only servers are members in the Raft consensus / leader election / etc +- Stop it once everything is running - (non-server agents forward requests to a server) +] -- All nodes must be told the address of at least another node to join +--- - (except for the first node, where this is optional) +## Expose our application web UI -- At least the first nodes must know how many nodes to expect to have quorum +- We need to connect to the `webui` service, but it is not publishing any port -- Consul can have only one "truth" at a time (hence the importance of quorum) +- Let's reconfigure it to publish a port ---- +- **Unfortunately,** dynamic port update doesn't work yet -## Starting our Consul cluster + (So we will `rm` and re-`create` the service instead) .exercise[ -- Make sure you're logged into `node1`, and: - +- Destroy the existing `webui` service and recreate it with the published port: ```bash - IPADDR=$(ip a ls dev eth0 | sed -n 's,.*inet \(.*\)/.*,\1,p') - for N in 1 2 3 4 5; do - ssh node$N -- docker run -d --restart=always --name consul_node$N \ - -e CONSUL_BIND_INTERFACE=eth0 --net host consul \ - agent -server -retry-join $IPADDR -bootstrap-expect 5 \ - -ui -client 0.0.0.0 - done + docker service rm webui + docker service create --network dockercoins --name webui \ + --publish 8000:80 $DOCKER_REGISTRY/dockercoins_webui:$TAG ``` ] -Note: in production, you probably want to remove `-client 0.0.0.0` since it -gives public access to your cluster! Also adapt `-bootstrap-expect` to your quorum. - ---- - -## Check that our Consul cluster is up - -- With your browser, navigate to any instance on port 8500 -
(in "NODES" you should see the five nodes) - -- Let's run a couple of useful Consul commands +??? .exercise[ -- Ask Consul the list of members it knows: +- Update `webui` so that we can connect to it from outside: ```bash - docker run --net host --rm consul members + docker service update webui --publish 8000:80 ``` -- Ask Consul which node is the current leader: +- Check as it's updated: ```bash - curl localhost:8500/v1/status/leader + watch docker service tasks webui ``` ] --- -## Check that our Swarm cluster is up +## Connect to the web UI -.exercise[ +- The web UI is now available on port 8000, *on all the nodes of the cluster* -- Try again the `docker info` from earlier: +.exercise[ - ```bash - eval $(docker-machine env --swarm node1) - docker info - docker ps - ``` +- Point your browser to any node, on port 8000 ] -All nodes should be visible. (If not, give them a minute or two to register.) - -The Consul containers should be visible. - -The Swarm containers, however, are hidden by Swarm (unless you use `docker ps -a`). - --- -# Running containers on Swarm +## Scaling the application -Try to run a few `busybox` containers. +- We can change scaling parameters with `docker update` as well -Then, let's get serious: +- We will do the equivalent of `docker-compose scale` .exercise[ -- Start a Redis service: -
`docker run -dP redis` +- Bring up more workers: + ```bash + docker service update worker --replicas 10 + ``` -- See the service address: -
`docker port $(docker ps -lq) 6379` +- Check the result in the web UI ] -This can be any of your five nodes. - ---- - -## Scheduling strategies - -- Random: pick a node at random -
(but honor resource constraints) - -- Spread: pick the node with the least containers -
(including stopped containers) - -- Binpack: try to maximize resource usage -
(in other words: use as few hosts as possible) +You should see the performance peaking at 10 hashes/s (like before). --- -# Resource allocation +## Scaling the `rng` service -- Swarm can honor resource reservations +- We want to utilize as best as we can the entropy generators + on our nodes -- This requires containers to be started with resource limits +- We want to run exactly one `rng` instance per node -- Swarm refuses to schedule a container if it cannot honor a reservation +- SwarmKit has a special scheduling mode for that, let's use it .exercise[ -- Start Redis containers with 1 GB of RAM until Swarm refuses to start more: +- Enable *global scheduling* for the `rng` service: ```bash - docker run -d -m 1G redis + docker service update rng --mode global ``` -] +- Look at the result in the web UI -On a cluster of 5 nodes with ~3.8 GB of RAM per node, Swarm will refuse to start the 16th container. +] --- -## Removing our Redis containers - -- Let's use a little bit of shell scripting - -.exercise[ - -- Remove all containers using the redis image: - ```bash - docker ps | awk '/redis/ {print $1}' | xargs docker rm -f - ``` - -] - -??? +## Checkpoint -## Things to know about resource allocation +- We've seen how to setup a Swarm -- `docker info` shows resource allocation for each node +- We've used it to host our own registry -- Swarm allows a 5% resource overcommit (tunable) +- We've built our app container images -- Containers without resource reservation can always be started +- We've used the registry to host those images -- Resources of stopped containers are still counted as being reserved +- We've deployed and scaled our application - - this guarantees that it will be possible to restart a stopped container +Let's treat ourselves with a nice pat in the back! - - containers have to be deleted to free up their resources +-- - - `docker update` can be used to change resource allocation on the fly +And carry on, we have much more to see and learn! --- class: title -# Setting up overlay networks +# Operating the Swarm --- -# Multi-host networking - -- Docker 1.9 has the concept of *networks* - -- By default, containers are on the default "bridge" network - -- You can create additional networks +## Finding the real cause of the bottleneck -- Containers can be on multiple networks +- We want to debug our app as we scale `worker` up and down -- Containers can dynamically join/leave networks +- We want to run tools like `ab` or `httping` on the internal network -- The "overlay" driver lets networks span multiple hosts +- .warning[This will be very hackish] -- Containers can have "network aliases" resolvable through DNS + (Better techniques and tools might become available in the future!) --- -## Manipulating networks, names, and aliases +# Breaking into an overlay network -- The preferred method is to let Compose do the heavy lifting for us +- We will create a dummy placeholder service on our network - (YAML-defined networking!) - -- But if we really need to, we can use the Docker CLI, with: - - `docker network ...` - - `docker run --net ... 
--net-alias ...` - -- The following slides illustrate those commands - ---- - -## Create a few networks and containers +- Then we will use `docker exec` to run more processes in this container .exercise[ -- Create two networks, *blue* and *green*: - ```bash - docker network create blue - docker network create green - docker network ls - ``` - -- Create containers with names of blue and green - things, on their respective networks: +- Start a "do nothing" container using our favorite Swiss-Army distro: ```bash - docker run -d --net-alias things --name sky --net blue -m 3G redis - docker run -d --net-alias things --name navy --net blue -m 3G redis - docker run -d --net-alias things --name grass --net green -m 3G redis - docker run -d --net-alias things --name forest --net green -m 3G redis + docker service create --network dockercoins --name debug --mode global \ + alpine sleep 1000000000 ``` ] +Why am I using global scheduling here? Because I'm lazy! +
+With global scheduling, I'm *guaranteed* to have an instance on the local node. +
+I don't need to SSH to another node. + --- -## Check connectivity within networks +## Entering the debug container -.exercise[ +- Once our container is started (which should be really fast because the alpine image is small), we can enter it (from any node) -- Check that our containers are on different nodes: +.exercise[ +- Locate the container: ```bash docker ps ``` -- This will work: - - ```bash - docker run --rm --net blue alpine ping -c 3 navy - ``` - -- This will not: - +- Enter it: ```bash - docker run --rm --net blue alpine ping -c 3 grass + docker exec -ti sh ``` ] -??? +--- + +## Labels -## Containers connected to multiple networks +- We can also be fancy and find the ID of the container automatically -- Some colors aren't *quite* blue *nor* green +- SwarmKit places labels on containers .exercise[ -- Create a container that we want to be on both networks: +- Get the ID of the container: ```bash - docker run -d --net-alias things --net blue --name turquoise redis + CID=$(docker ps -q --filter label=com.docker.swarm.service.name=debug) ``` -- Check connectivity: +- And enter the container: ```bash - docker exec -ti turquoise ping -c 3 navy - docker exec -ti turquoise ping -c 3 grass + docker exec -ti $CID sh ``` - (First works; second doesn't) ] -??? +--- -## Dynamically connecting containers +## Installing our debugging tools -- This is achieved with the command: -
`docker network connect NETNAME CONTAINER` +- Ideally, you would author your own image, with all your favorite tools, and use it instead of the base `alpine` image -.exercise[ +- But we can also dynamically install whatever we need -- Dynamically connect to the green network: - ```bash - docker network connect green turquoise - ``` +.exercise[ -- Check connectivity: +- Install a few tools: ```bash - docker exec -ti turquoise ping -c 3 navy - docker exec -ti turquoise ping -c 3 grass + apk add --update curl apache2-utils drill ``` - (Both commands work now) ] --- -## Network aliases +## Investigating the `rng` service -- Each container was created with the network alias `things` - -- Network aliases are scoped by network +- First, let's check what `rng` resolves to .exercise[ -- Resolve the `things` alias from both networks: +- Use drill or nslookup to resolve `rng`: ```bash - docker run --rm --net blue alpine nslookup things - docker run --rm --net green alpine nslookup things + drill rng ``` ] -??? - -## Under the hood +This give us one IP address. It is not the IP address of a container. +It is a virtual IP address (VIP) for the `rng` service. -- Each network has an interface in the container +--- -- There is also an interface for the default gateway +## Investigating the VIP .exercise[ -- View interfaces in our `turquoise` container: +- Try to ping the VIP: ```bash - docker exec -ti turquoise ip addr ls + ping rng ``` ] -??? +It doesn't respond to ping at this point. (This might change in the future.) -## Dynamically disconnecting containers +--- -- There is a mirror command to `docker network connect` +## What if I don't like VIPs? -.exercise[ +- Services can be published using two modes: VIP and DNSRR. -- Disconnect the *turquoise* container from *blue* - (its original network): - ```bash - docker network disconnect blue turquoise - ``` +- With VIP, you get a virtual IP for the service, and an load balancer + based on IPVS -- Check connectivity: - ```bash - docker exec -ti turquoise ping -c 3 navy - docker exec -ti turquoise ping -c 3 grass - ``` - (First command fails, second one works) + (By the way, IPVS is totally awesome and if you want to learn more about it in the context of containers, + I highly recommend [this talk](https://www.youtube.com/watch?v=oFsJVV1btDU&index=5&list=PLkA60AVN3hh87OoVra6MHf2L4UR9xwJkv) by [@kobolog](https://twitter.com/kobolog) at DC15EU!) -] +- With DNSRR, you get the former behavior (from Engine 1.11), where + resolving the service yields the IP addresses of all the containers for + this service ---- +- You change this with `docker service create --endpoint-mode [VIP|DNSRR]` -## Cleaning up +--- -.exercise[ +## Testing and benchmarking our service -- Destroy containers: +- We will check that the service is up with `rng`, then + benchmark it with `ab` - +.exercise[ +- Make a test request to the service: ```bash - docker rm -f sky navy grass forest + curl rng ``` -- Destroy networks: - +- Open another window, and stop the workers, to test in isolation: ```bash - docker network rm blue - docker network rm green + docker service update worker --replicas 0 ``` ] ---- - -## Cleaning up after an outage or a crash - -- You cannot remove a network if it still has containers - -- There is no `"rm -f"` for network - -- If a network still has stale endpoints, you can use `"disconnect -f"` - ---- - -class: title - -# Building images with Swarm +Wait until the workers are stopped (check with `docker service ls`) +before continuing. 
--- -## Building images with Swarm - -- Special care must be taken when building and running images - -- We *can* build images on Swarm (with `docker build` or `docker-compose build`) - -- One node will be picked at random, and the build will happen there - -- At the end of the build, the image will be present *only on that node* - ---- +## Benchmarking `rng` -## Building on Swarm can yield inconsistent results +We will send 50 requests, but with various levels of concurrency. -- Builds are scheduled on random nodes +.exercise[ -- Multiple builds and rebuilds can happen on different nodes +- Send 50 requests, with a single sequential client: + ```bash + ab -c 1 -n 50 http://rng/10 + ``` -- If a build happens on a different node, the cache of the previous build cannot be used +- Send 50 requests, with fifty parallel clients: + ```bash + ab -c 50 -n 50 http://rng/10 + ``` -- Worse: you can have two different images with the same name on your cluster +] --- -## Scaling won't work as expected - -Consider the following scenario: +## Benchmark results for `rng` -- `docker-compose up` -
- → each service is built on a node, and runs there - -- `docker-compose scale` -
- → additional containers for this service can only be spawned where the image was built - -- `docker-compose up` (again) -
- → services might be built (and started) on different nodes - -- `docker-compose scale` -
- → containers can be spawned with both the new and old images - ---- - -## Scaling correctly with Swarm - -- After building an image, it should be distributed to the cluster - - (Or made available through a registry, so that nodes can download it automatically) - -- Instead of referencing images with the `:latest` tag, unique tags should be used - - (Using e.g. timestamps, version numbers, or VCS hashes) - ---- - -## Why can't Swarm do this automatically for us? - -- Let's step back and think for a minute ... - -- What should `docker build` do on Swarm? - - - build on one machine - - - build everywhere ($$$) - -- After the build, what should `docker run` do? - - - run where we built (how do we know where it is?) - - - run on any machine that has the image - -- Could Compose+Swarm solve this automatically? - ---- - -## A few words about "sane defaults" - -- *It would be nice if Swarm could pick a node, and build there!* - - - but which node should it pick? - - what if the build is very expensive? - - what if we want to distribute the build across nodes? - - what if we want to tag some builder nodes? - - ok but what if no node has been tagged? - -- *It would be nice if Swarm could automatically push images!* - - - using the Docker Hub is an easy choice -
(you just need an account) - - but some of us can't/won't use Docker Hub -
(for compliance reasons or because no network access) - -.small[("Sane" defaults are nice only if we agree on the definition of "sane")] - ---- - -## The plan - -- Build on a single node (`node1`) - -- Tag images with the current UNIX timestamp (for simplicity) - -- Upload them to a registry - -- Update the Compose file to use those images - -This is all automated with the [`build-tag-push.py` script](https://github.com/jpetazzo/orchestration-workshop/blob/master/bin/build-tag-push.py). - ---- - -## Which registry do we want to use? - -.small[ - -- **Docker Hub** - - - hosted by Docker Inc. - - requires an account (free, no credit card needed) - - images will be public (unless you pay) - - located in AWS EC2 us-east-1 - -- **Docker Trusted Registry** - - - self-hosted commercial product - - requires a subscription (free 30-day trial available) - - images can be public or private - - located wherever you want - -- **Docker open source registry** - - - self-hosted barebones repository hosting - - doesn't require anything - - doesn't come with anything either - - located wherever you want - -] - ---- - -## Using Docker Hub - -- Set the `DOCKER_REGISTRY` environment variable to your Docker Hub user name -
(the `build-tag-push.py` script prefixes each image name with that variable) - -- We will also see how to run the open source registry -
(so use whatever option you want!) - -.exercise[ - - - -- Set the following environment variable: -
`export DOCKER_REGISTRY=jpetazzo` - -- (Use *your* Docker Hub login, of course!) - -- Log into the Docker Hub: -
`docker login` - - - -] - ---- - -## Using Docker Trusted Registry - -If we wanted to use DTR, we would: - -- make sure we have a Docker Hub account -- [activate a Docker Datacenter subscription]( - https://hub.docker.com/enterprise/trial/) -- install DTR on our machines -- set `DOCKER_REGISTRY` to `dtraddress:port/user` - -*This is out of the scope of this workshop!* - ---- - -## Using open source registry - -- We need to run a `registry:2` container -
(make sure you specify tag `:2` to run the new version!) - -- It will store images and layers to the local filesystem -
(but you can add a config file to use S3, Swift, etc.) - -- Docker *requires* TLS when communicating with the registry, - unless for registries on `localhost` or with the Engine - flag `--insecure-registry` - -- Our strategy: run a reverse proxy on `localhost:5000` on each node - ---- - -## Registry frontends and backend - -![Registry frontends](registry-frontends.png) - ---- - -# Deploying a local registry - -- There is a Compose file for that - -.exercise[ - -- Go to the `registry` directory in the repository: - ```bash - cd ~/orchestration-workshop/registry - ``` - -] - -Let's examine the `docker-compose.yml` file. - ---- - -## Running a local registry with Compose - -```yaml -version: "2" - -services: - backend: - image: registry:2 - frontend: - image: jpetazzo/hamba - command: 5000 backend:5000 - ports: - - "127.0.0.1:5000:5000" - depends_on: - - backend -``` - -- *Backend* is the actual registry. -- *Frontend* is the ambassador that we deployed earlier. -
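  For instance, the registry can read its configuration from `REGISTRY_*` environment variables, so storing data in S3 could look like this (a sketch; region, bucket, and credentials are placeholders):

  ```bash
  docker run -d -p 5000:5000 \
      -e REGISTRY_STORAGE=s3 \
      -e REGISTRY_STORAGE_S3_REGION=us-east-1 \
      -e REGISTRY_STORAGE_S3_BUCKET=my-registry-bucket \
      -e REGISTRY_STORAGE_S3_ACCESSKEY=AKIA... \
      -e REGISTRY_STORAGE_S3_SECRETKEY=... \
      registry:2
  ```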
-It communicates with *backend* using an internal network -and network aliases. - ---- - -## Starting a local registry with Compose - -- We will bring up the registry - -- Then we will ensure that one *frontend* is running - on each node by scaling it to our number of nodes - -.exercise[ - -- Start the registry: - ```bash - docker-compose up -d - ``` - -] - ---- - -## "Scaling" the local registry - -- This is a particular kind of scaling - -- We just want to ensure that one *frontend* - is running on every single node of the cluster - -.exercise[ - -- Scale the registry: - ```bash - for N in $(seq 1 5); do - docker-compose scale frontend=$N - done - ``` - -] - -Note: Swarm might do that automatically for us in the future. - ---- - -## Testing our local registry - -- We can retag a small image, and push it to the registry - -.exercise[ - -- Make sure we have the busybox image, and retag it: - ```bash - docker pull busybox - docker tag busybox localhost:5000/busybox - ``` - -- Push it: - ```bash - docker push localhost:5000/busybox - ``` - -] - ---- - -## Checking what's on our local registry - -- The registry API has endpoints to query what's there - -.exercise[ - -- Ensure that our busybox image is now in the local registry: - ```bash - curl http://localhost:5000/v2/_catalog - ``` - -] - -The curl command should output: -```json -{"repositories":["busybox"]} -``` - ---- - -## Adapting our Compose file to run on Swarm - -- We can get rid of all the `ports` section, except for the web UI - -.exercise[ - -- Go back to the dockercoins directory: - ```bash - cd ~/orchestration-workshop/dockercoins - ``` - -] - ---- - -## Our new Compose file - -.small[ -```yaml -version: '2' - -services: - rng: - build: rng - - hasher: - build: hasher - - webui: - build: webui - ports: - - "8000:80" - - redis: - image: redis - - worker: - build: worker -``` -] - -Copy-paste this into `docker-compose.yml` -
(or you can `cp docker-compose.yml-v2 docker-compose.yml`) - ---- - -## Use images, not builds - -- We need to replace each `build` with an `image` - -- We will use the `build-tag-push.py` script for that - -.exercise[ - -- Set `DOCKER_REGISTRY` to use our local registry - -- Make sure that you are building on `node1` - -- Then run the script - - ```bash - export DOCKER_REGISTRY=localhost:5000 - eval $(docker-machine env node1) - ../bin/build-tag-push.py - ``` - -] - ---- - -## Run the application - -- At this point, our app is ready to run - -.exercise[ - -- Start the application: - ```bash - export COMPOSE_FILE=docker-compose.yml-`NNN` - eval $(docker-machine env node1 --swarm) - docker-compose up -d - ``` - -- Observe that it's running on multiple nodes: -
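To double-check the result, you can ask Compose to validate and print the fully resolved file (the `config` command should be available in the Compose version we are using):

```bash
docker-compose -f docker-compose.yml config
```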
(each container name is prefixed with the node it's running on) - ```bash - docker ps - ``` - -] - ---- - -## View the performance graph - -- Load up the graph in the browser - -.exercise[ - -- Check the `webui` service address and port: - ```bash - docker-compose port webui 80 - ``` - -- Open it in your browser - -] - ---- - -## Scaling workers - -- Scaling the `worker` service works out of the box - (like before) - -.exercise[ - -- Scale `worker`: - ```bash - docker-compose scale worker=10 - ``` - -] - -Check that workers are on different nodes. - -However, we hit the same bottleneck as before. - -How can we address that? - ---- - -## Finding the real cause of the bottleneck - -- If time permits, we can benchmark `rng` and `hasher` to find out more - -- Otherwise, we'll fast-forward a bit - ---- - -## Benchmarking in isolation - -- If we want the benchmark to be accurate, we need to make sure that `rng` and `hasher` are not receiving traffic - -.exercise[ - -- Stop the `worker` containers: - ```bash - docker-compose kill worker - ``` - -] - ---- - -## A better benchmarking tool - -- Instead of `httping`, we will now use `ab` (Apache Bench) - -- We will install it in an `alpine` container placed on the network used by our application - -.exercise[ - -- Start an interactive `alpine` container on the `dockercoins_rng` network: - ```bash - docker run -ti --net dockercoins_default alpine sh - ``` - -- Install `ab` with the `apache2-utils` package: - ```bash - apk add --update apache2-utils - ``` - -] - ---- - -## Benchmarking `rng` - -We will send 50 requests, but with various levels of concurrency. - -.exercise[ - -- Send 50 requests, with a single sequential client: - ```bash - ab -c 1 -n 50 http://rng/10 - ``` - -- Send 50 requests, with ten parallel clients: - ```bash - ab -c 10 -n 50 http://rng/10 - ``` - -] - ---- - -## Benchmark results for `rng` - -- In both cases, the benchmark takes ~5 seconds to complete - -- When serving requests sequentially, they each take 100ms - -- In the parallel scenario, the latency increased dramatically: - - - one request is served in 100ms - - another is served in 200ms - - another is served in 300ms - - ... - - another is served in 1000ms - -- What about `hasher`? - ---- - -## Benchmarking `hasher` - -We will do the same tests for `hasher`. - -The command is slightly more complex, since we need to post random data. - -First, we need to put the POST payload in a temporary file. - -.exercise[ - -- Install curl in the container, and generate 10 bytes of random data: - ```bash - apk add curl - curl http://rng/10 >/tmp/random - ``` - -] - ---- - -## Benchmarking `hasher` - -Once again, we will send 50 requests, with different levels of concurrency. - -.exercise[ - -- Send 50 requests with a sequential client: - ```bash - ab -c 1 -n 50 -T application/octet-stream \ - -p /tmp/random http://hasher/ - ``` - -- Send 50 requests with 10 parallel clients: - ```bash - ab -c 10 -n 50 -T application/octet-stream \ - -p /tmp/random http://hasher/ - ``` - -] - ---- - -## Benchmark results for `hasher` - -- The sequential benchmarks takes ~5 seconds to complete - -- The parallel benchmark takes less than 1 second to complete - -- In both cases, each request takes a bit more than 100ms to complete - -- Requests are a bit slower in the parallel benchmark - -- It looks like `hasher` is better equiped to deal with concurrency than `rng` - ---- - -class: title - -Why? - ---- - -## Why does everything take (at least) 100ms? 
- --- - -`rng` code: - -![RNG code screenshot](delay-rng.png) - --- - -`hasher` code: - -![HASHER code screenshot](delay-hasher.png) - ---- - -class: title - -But ... - -WHY?!? - ---- - -## Why did we sprinkle this sample app with sleeps? - -- Deterministic performance -
(regardless of instance speed, CPUs, I/O...) - --- - -- Actual code sleeps all the time anyway - --- - -- When your code makes a remote API call: - - - it sends a request; - - - it sleeps until it gets the response; - - - it processes the response. - ---- - -## Why do `rng` and `hasher` behave differently? - -![Equations on a blackboard](equations.png) - --- - -(Synchronous vs. asynchronous event processing) - ---- - -## How to make `rng` go faster - -- Obvious solution: comment out the `sleep` instruction - --- - -- Unfortunately, in the real world, network latency exists - --- - -- More realistic solution: use an asynchronous framework -
(e.g. use gunicorn with gevent) - --- - -- Reminder: we can't change the code! - --- - -- Solution: scale out `rng` -
(dispatch `rng` requests on multiple instances) - ---- - -# Scaling web services with Compose on Swarm - -- We *can* scale network services with Compose - -- The result may or may not be satisfactory, though! - -.exercise[ - -- Restart the `worker` service: - ```bash - docker-compose start worker - ``` - -- Scale the `rng` service: - ```bash - docker-compose scale rng=5 - ``` - -] - ---- - -## Results - -- In the web UI, you might see a performance increase ... or maybe not - --- - -- Since Engine 1.11, we get round-robin DNS records - - (i.e. resolving `rng` will yield the IP addresses of all 3 containers) - -- Docker randomizes the records it sends - -- But many resolvers will sort them in unexpected ways - -- Depending on various factors, you could get: - - - all traffic on a single container - - traffic perfectly balanced on all containers - - traffic unevenly balanced across containers - ---- - -## Assessing DNS randomness - -- Let's see how our containers resolve DNS requests - -.exercise[ - -- On each of our 10 scaled workers, execute 5 ping requests: - ```bash - for N in $(seq 1 10); do - echo PING__________$N - for I in $(seq 1 5); do - docker exec -ti dockercoins_worker_$N ping -c1 rng - done - done | grep PING - ``` - -] - -(The 7th Might Surprise You!) - ---- - -## DNS randomness - -- Other programs can yield different results - -- Same program on another distro can yield different results - -- Same source code with another libc or resolver can yield different results - -- Running the same test at different times can yield different results - -- Did I mention that Your Results May Vary? - ---- - -## Implementing fair load balancing - -- Instead of relying on DNS round robin, let's use a proper load balancer - -- Use Compose to create multiple copies of the `rng` service - -- Put a load balancer in front of them - -- Point other services to the load balancer - ---- - -## Naming problem - -- The service is called `rng` - -- Therefore, it is reachable with the network name `rng` - -- Our application code (the `worker` service) connects to `rng` - -- So the name `rng` should resolve to the load balancer - -- What should we do‽ - ---- - -## Naming is *per-network* - -- Solution: put `rng` on its own network - -- That way, it doesn't take the network name `rng` -
(at least not on the default network) - -- Have the load balancer sit on both networks - -- Add the name `rng` to the load balancer - ---- - -class: pic - -Original DockerCoins - -![](dockercoins-single-node.png) - ---- - -class: pic - -Load-balanced DockerCoins - -![](dockercoins-multi-node.png) - ---- - -## Declaring networks - -- Networks (other than the default one) - *must* be declared - in a top-level `networks` section, - placed anywhere in the file - -.exercise[ - -- Add the `rng` network to the Compose file, `docker-compose.yml-NNN`: - ```yaml - version: '2' - - networks: - rng: - - services: - rng: - image: ... - ... - ``` - -] - ---- - -## Putting the `rng` service in its network - -- Services can have a `networks` section - -- If they don't: they are placed in the default network - -- If they do: they are placed only in the mentioned networks - -.exercise[ - -- Change the `rng` service to put it in its network: - ```yaml - rng: - image: localhost:5000/dockercoins_rng:… - networks: - rng: - ``` - -] - ---- - -## Adding the load balancer - -- The load balancer has to be in both networks: `rng` and `default` -- In the `default` network, it must have the `rng` alias -- We will use the `jpetazzo/hamba` image - -.exercise[ - -- Add the `rng-lb` service to the Compose file: - ```yaml - rng-lb: - image: jpetazzo/hamba - command: run - networks: - rng: - default: - aliases: [ rng ] - ``` -] - ---- - -## Load balancer initial configuration - -- We specified `run` as the initial command - -- This tells `hamba` to wait for an initial configuration - -- The load balancer will not be operational (until we feed it its configuration) - ---- - -## Start the application - -.exercise[ - -- Bring up DockerCoins: - ```bash - docker-compose up -d - ``` - -- See that `worker` is complaining: - ```bash - docker-compose logs --tail 100 --follow worker - ``` -] - ---- - -## Add one backend to the load balancer - -- Multiple solutions: - - - lookup the IP address of the `rng` backend - - use the backend's network name - - use the backend's container name (easiest!) - -.exercise[ - -- Configure the load balancer: - ```bash - docker run --rm --volumes-from dockercoins_rng-lb_1 \ - --net container:dockercoins_rng-lb_1 \ - jpetazzo/hamba reconfigure 80 dockercoins_rng_1 80 - ``` - -] - -The application should now be working correctly. - ---- - -## Add all backends to the load balancer - -- The command is similar to the one before - -- We need to pass the list of all backends - -.exercise[ - -- Reconfigure the load balancer: - ```bash - docker run --rm \ - --volumes-from dockercoins_rng-lb_1 \ - --net container:dockercoins_rng-lb_1 \ - jpetazzo/hamba reconfigure 80 \ - $(for N in $(seq 1 5); do - echo dockercoins_rng_$N:80 - done) - ``` - -] - ---- - -## Automating the process - -- Nobody loves artisan YAML handicraft - -- This can be scripted very easily - -- But can it be fully automated? 
- ---- - -## Use DNS to discover the addresses of all the backends - -- When multiple containers have the same network alias: - - - Engine 1.10 returns only one of them (the same one across the whole network) - - - Engine 1.11 returns all of them (in a random order) - -- A "smart" client can use all records to implement load balancing - -- We can compose `jpetazzo/hamba` with a special-purpose container, - which will dynamically generate HAProxy's configuration when - the DNS records are updated - ---- - -## Introducing `jpetazzo/watchdns` - -- [100 lines of pure POSIX scriptery]( - https://github.com/jpetazzo/watchdns/blob/master/watchdns) - -- Resolves a given DNS name every second - -- Each time the result changes, a new HAProxy configuration is generated - -- When used together with `--volumes-from` and `jpetazzo/hamba`, it - updates the configuration of an existing load balancer - -- Comes with a companion script, [`add-load-balancer-v2.py`](https://github.com/jpetazzo/orchestration-workshop/blob/master/bin/add-load-balancer-v2.py), to update your Compose files - ---- - -## Using `jpetazzo/watchdns` - -.exercise[ - -- First, revert the Compose file to remove the load balancer - -- Then, run `add-load-balancer-v2.py`: - ```bash - ../bin/add-load-balancer-v2.py rng - ``` - -- Inspect the resulting Compose file - -] - ---- - -## Scaling with `watchdns` - -.exercise[ - -- Start the application with the new sidekick containers: - ```bash - docker-compose up -d - ``` - -- Scale `rng`: - ```bash - docker-compose scale rng=10 - ``` - -- Check logs: - ```bash - docker-compose logs rng-wd - ``` - -] - ---- - -## Comments - -- This is a very crude implementation of the pattern - -- A Go version would only be a bit longer, but use much less resources - -- When there are many backends, reacting quickly to change is less important - - (i.e. it's not necessary to re-resolve records every second!) - ---- - -class: title - -# All things ops
(logs, backups, and more) - ---- - -# Logs - -- Two strategies: - - - log to plain files on volumes - - - log to stdout -
(and use a logging driver) - ---- - -## Logging to plain files on volumes - -(Sorry, that part won't be hands-on!) - -- Start a container with `-v /logs` - -- Make sure that all log files are in `/logs` - -- To check logs, run e.g. - - ```bash - docker run --volumes-from ... ubuntu sh -c "grep WARN /logs/*.log" - ``` - -- Or just go interactive: - - ```bash - docker run --volumes-from ... -ti ubuntu - ``` - -- You can (should) start a log shipper that way - ---- - -## Logging to stdout - -- All containers should write to stdout/stderr - -- Docker will collect logs and pass them to a logging driver - -- Logging driver can specified globally, and per container -
(changing it for a container overrides the global setting) - -- To change the global logging driver, pass extra flags to the daemon -
(requires a daemon restart) - -- To override the logging driver for a container, pass extra flags to `docker run` - ---- - -## Specifying logging flags - -- `--log-driver` - - *selects the driver* - -- `--log-opt key=val` - - *adds driver-specific options* -
*(can be repeated multiple times)* - -- The flags are identical for `docker daemon` and `docker run` - ---- - -## Logging flags in practice - -- If you provision your nodes with Docker Machine, - you can set global logging flags (which will apply to all - containers started by a given Engine) like this: - - ```bash - docker-machine create ... --engine-opt log-driver=... - ``` - -- Otherwise, use your favorite method to edit or manage configuration files - -- You can set per-container logging options in Compose files - ---- - -## Available drivers - -- json-file (default) - -- syslog (can send to UDP, TCP, TCP+TLS, UNIX sockets) - -- awslogs (AWS CloudWatch) - -- journald - -- gelf - -- fluentd - -- splunk - ---- - -## About json-file ... - -- It doesn't rotate logs by default, so your disks will fill up - - (Unless you set `maxsize` *and* `maxfile` log options.) - -- It's the only one supporting logs retrieval - - (If you want to use `docker logs`, `docker-compose logs`, - or fetch logs from the Docker API, you need json-file!) - -- This might change in the future - - (But it's complex since there is no standard protocol - to *retrieve* log entries.) - -All about logging in the documentation: -https://docs.docker.com/reference/logging/overview/ - ---- - -# Setting up ELK to store container logs - -*Important foreword: this is not an "official" or "recommended" -setup; it is just an example. We do not endorse ELK, GELF, -or the other elements of the stack more than others!* - -What we will do: - -- Spin up an ELK stack, with Compose - -- Gaze at the spiffy Kibana web UI - -- Manually send a few log entries over GELF - -- Reconfigure our DockerCoins app to send logs to ELK - ---- - -## What's in an ELK stack? - -- ELK is three components: - - - ElasticSearch (to store and index log entries) - - - Logstash (to receive log entries from various - sources, process them, and forward them to various - destinations) - - - Kibana (to view/search log entries with a nice UI) - -- The only component that we will configure is Logstash - -- We will accept log entries using the GELF protocol - -- Log entries will be stored in ElasticSearch, -
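  For example, sending a container's output to a remote syslog server could look like this (a sketch; the address is a placeholder):

  ```bash
  docker run -d \
      --log-driver syslog \
      --log-opt syslog-address=udp://192.168.0.42:514 \
      nginx
  ```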
and displayed on Logstash's stdout for debugging - ---- - -## Starting our ELK stack - -- We will use a *separate* Compose file - -- The Compose file is in the `elk` directory - -.exercise[ - -- Go to the `elk` directory: - ```bash - cd ~/orchestration-workshop/elk - ``` - -- Start the ELK stack: - ```bash - unset COMPOSE_FILE - docker-compose up -d - ``` - -] - ---- - -## Making sure that each node has a local logstash - -- We will configure each container to send logs to `localhost:12201` - -- We need to make sure that each node has a logstash container listening on port 12201 - -.exercise[ - -- Scale the `logstash` service to 5 instances (one per node): - ```bash - for N in $(seq 1 5); do - docker-compose scale logstash=$N - done - ``` - -] - ---- - -## Checking that our ELK stack works - -- Our default Logstash configuration sends a test - message every minute - -- All messages are stored into ElasticSearch, - but also shown on Logstash stdout - -.exercise[ - -- Look at Logstash stdout: - ```bash - docker-compose logs logstash - ``` - -] - -After less than one minute, you should see a `"message" => "ok"` -in the output. - ---- - -## Connect to Kibana - -- Our ELK stack exposes two public services: -
the Kibana web server, and the GELF UDP socket - -- They are both exposed on their default port numbers -
(5601 for Kibana, 12201 for GELF) - -.exercise[ - -- Check the address of the node running kibana: - ```bash - docker-compose ps - ``` - -- Open the UI in your browser: http://instance-address:5601/ - -] - ---- - -## "Configuring" Kibana - -- If you see a status page with a yellow item, wait a minute and reload - (Kibana is probably still initializing) - -- Kibana should offer you to "Configure an index pattern", - just click the "Create" button - -- Then: - - - click "Discover" (in the top-left corner) - - click "Last 15 minutes" (in the top-right corner) - - click "Last 1 hour" (in the list in the middle) - - click "Auto-refresh" (top-right corner) - - click "5 seconds" (top-left of the list) - -- You should see a series of green bars (with one new green bar every minute) - ---- - -![Screenshot of Kibana](kibana.png) - ---- - -## Sending container output to Kibana - -- We will create a simple container displaying "hello world" - -- We will override the container logging driver - -- The GELF address is `127.0.0.1:12201`, because the Compose file - explicitly exposes the GELF socket on port 12201 - -.exercise[ - -- Start our one-off container: - - ```bash - docker run --rm --log-driver gelf \ - --log-opt gelf-address=udp://127.0.0.1:12201 \ - alpine echo hello world - ``` - -] - ---- - -## Visualizing container logs in Kibana - -- Less than 5 seconds later (the refresh rate of the UI), - the log line should be visible in the web UI - -- We can customize the web UI to be more readable - -.exercise[ - -- In the left column, move the mouse over the following - columns, and click the "Add" button that appears: - - - host - - container_name - - message - -] - ---- - -## Switching back to the DockerCoins application - -.exercise[ - -- Go back to the dockercoins directory: - ```bash - cd ~/orchestration-workshop/dockercoins - ``` - -- Set the `COMPOSE_FILE` variable: - ```bash - export COMPOSE_FILE=docker-compose.yml-`NNN` - ``` - -] - ---- -## Add the logging driver to the Compose file - -- We need to add the logging section to each container - -.exercise[ - -- Edit the `docker-compose.yml-NNN` file, adding the following lines **to each container**: - - ```yaml - logging: - driver: gelf - options: - gelf-address: "udp://127.0.0.1:12201" - ``` - -] - -There is also a script, [`../bin/add-logging.py`](https://github.com/jpetazzo/orchestration-workshop/blob/master/bin/add-logging.py), to do that automatically. - ---- - -## Update the DockerCoins app - -.exercise[ - -- Use Compose normally: - ```bash - docker-compose up -d - ``` - -] - -If you look in the Kibana web UI, you will see log lines -refreshed every 5 seconds. - -Note: to do interesting things (graphs, searches...) we -would need to create indexes. This is beyond the scope -of this workshop. - ---- - -## Logging in production - -- If we were using an ELK stack: - - - scale ElasticSearch - - interpose a Redis or Kafka queue to deal with bursts - -- Configure your Engines to send all logs to ELK by default - -- Start the logging containers with a different logging system -
(to avoid a logging loop) - -- Make sure you don't end up writing *all logs* on the nodes running Logstash! - ---- - -# Network traffic analysis - -- We want to inspect the network traffic entering/leaving `dockercoins_redis_1` - -- We will use *shared network namespaces* to perform network analysis - -- Two containers sharing the same network namespace... - - - have the same IP addresses - - - have the same network interfaces - -- `eth0` is therefore the same in both containers - ---- - -## Install and start `ngrep` - -Ngrep uses libpcap (like tcpdump) to sniff network traffic. - -.exercise[ - - - -- Start a container with the same network namespace: -
`docker run --net container:dockercoins_redis_1 -ti alpine sh` - -- Install ngrep: -
`apk update && apk add ngrep` - -- Run ngrep: -
`ngrep -tpd eth0 -Wbyline . tcp` - - - -] - -You should see a stream of Redis requests and responses. - ---- - -# Backups - -- We want to enable backups for `dockercoins_redis_1` - -- We don't want to install extra software in this container - -- We will use a special backup container: - - - sharing the same volumes - - - using the same network stack (to connect to it easily) - - - possibly containing our backup tools - -- This works because the `redis` container image stores its data on a volume - ---- - -## Starting the backup container - -- We will use the `--net container:` option to be able to connect locally - -- We will use the `--volumes-from` option to access the container's persistent data - -.exercise[ - - - -- Start the container: - - ```bash - docker run --net container:dockercoins_redis_1 \ - --volumes-from dockercoins_redis_1:ro \ - -v /tmp/myredis:/output \ - -ti alpine sh - ``` - -- Look in `/data` in the container (that's where Redis puts its data dumps) -] - ---- - -## Connecting to Redis - -- We need to tell Redis to perform a data dump *now* - -.exercise[ - -- Connect to Redis: - ```bash - telnet localhost 6379 - ``` - -- Issue commands `SAVE` then `QUIT` - -- Look at `/data` again (notice the time stamps) - -] - -- There should be a recent dump file now! - ---- - -## Getting the dump out of the container - -- We could use many things: - - - s3cmd to copy to S3 - - SSH to copy to a remote host - - gzip/bzip/etc before copying - -- We'll just copy it to the Docker host - -.exercise[ - -- Copy the file from `/data` to `/output` - -- Exit the container - -- Look into `/tmp/myredis` (on the host) - - - -] - ---- - -## Scheduling backups - -In the "old world," we (generally) use cron. - -With containers, what are our options? - --- - -- run `cron` on the Docker host, and put `docker run` in the crontab - --- - -- run `cron` in the backup container, and make sure it keeps running -
(e.g. with `docker run --restart=…`) - --- - -- run `cron` in a container, and start backup containers from there - --- - -- listen to the Docker events stream, automatically scheduling backups -
when database containers are started - ---- - -# Controlling Docker from a container - -- In a local environment, just bind-mount the Docker control socket: - ```bash - docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker - ``` - -- Otherwise, you have to: - - - set `DOCKER_HOST`, - - set `DOCKER_TLS_VERIFY` and `DOCKER_CERT_PATH` (if you use TLS), - - copy certificates to the container that will need API access. - -More resources on this topic: - -- [Do not use Docker-in-Docker for CI]( - http://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/) -- [One container to rule them all]( - http://jpetazzo.github.io/2016/04/03/one-container-to-rule-them-all/) - ---- - -# Docker events stream - -- Using the Docker API, we can get real-time - notifications of everything happening in the Engine: - - - container creation/destruction - - container start/stop - - container exit/signal/out of memory - - container attach/detach - - volume creation/destruction - - network creation/destruction - - connection/disconnection of containers - ---- - -## Subscribing to the events stream - -- This is done with `docker events` - -.exercise[ - -- Get a stream of events: - ```bash - docker events - ``` - - - -- In a new terminal, do *anything*: - ```bash - docker run --rm alpine sleep 10 - ``` - -] - -You should see events for the lifecycle of the -container, as well as its connection/disconnection -to the default `bridge` network. - ---- - -## A few tools to use the events stream - -- [docker-spotter](https://github.com/discordianfish/docker-spotter) - - Written in Go; simple building block to use directly in Shell scripts - -- [ahab](https://github.com/instacart/ahab) - - Written in Python; available as a library; ships with a CLI tool - ---- - -# Security upgrades - -- This section is not hands-on - -- Public Service Announcement - -- We'll discuss: +- When serving requests sequentially, they each take 100ms - - how to upgrade the Docker daemon +- In the parallel scenario, the latency increased dramatically: - - how to upgrade container images +- What about `hasher`? --- -## Upgrading the Docker daemon +## Benchmarking `hasher` + +We will do the same tests for `hasher`. -- Stop all containers cleanly +The command is slightly more complex, since we need to post random data. -- Stop the Docker daemon +First, we need to put the POST payload in a temporary file. -- Upgrade the Docker daemon +.exercise[ -- Start the Docker daemon +- Install curl in the container, and generate 10 bytes of random data: + ```bash + curl http://rng/10 >/tmp/random + ``` -- Start all containers +] -- This is like upgrading your Linux kernel, but it will get better +--- -(Docker Engine 1.11 is using containerd, which will ultimately allow seamless upgrades.) +## Benchmarking `hasher` -??? +Once again, we will send 50 requests, with different levels of concurrency. -## In practice +.exercise[ -- Keep track of running containers before stopping the Engine: +- Send 50 requests with a sequential client: ```bash - docker ps --no-trunc -q | - tee /tmp/running | - xargs -n1 -P10 docker stop + ab -c 1 -n 50 -T application/octet-stream -p /tmp/random http://hasher/ ``` -- Restart those containers after the Engine is running again: +- Send 50 requests with 50 parallel clients: ```bash - xargs docker start < /tmp/running + ab -c 50 -n 50 -T application/octet-stream -p /tmp/random http://hasher/ ``` -
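For example, the first option (cron on the Docker host) could look like the crontab entry below. This is just a sketch: the schedule, the `/backups` path, and the dump file name are assumptions to adapt to your setup.

```bash
# /etc/cron.d/redis-backup (on the Docker host)
# Every night at 3:00, ask Redis to SAVE, then copy the dump out of the volume.
# (The "%" is escaped because cron treats it specially.)
0 3 * * * root docker run --rm \
    --net container:dockercoins_redis_1 \
    --volumes-from dockercoins_redis_1:ro \
    -v /backups:/output \
    redis sh -c 'redis-cli save && cp /data/dump.rdb /output/dump-$(date +\%F).rdb'
```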
(Run this multiple times if you have linked containers!) ---- +] -## Upgrading container images +--- -- When a vulnerability is announced: +## Benchmark results for `hasher` - - if it affects your base images: make sure they are fixed first +- The sequential benchmarks takes ~5 seconds to complete - - if it affects downloaded packages: make sure they are fixed first +- The parallel benchmark takes less than 1 second to complete - - re-pull base images +- In both cases, each request takes a bit more than 100ms to complete - - rebuild +- Requests are a bit slower in the parallel benchmark - - restart containers +- It looks like `hasher` is better equiped to deal with concurrency than `rng` --- -## How do we know when to upgrade? - -- Subscribe to CVE notifications - - - https://cve.mitre.org/ +class: title - - your distros' security announcements +Why? -- Check CVE status in official images -
(tag [cve-tracker]( - https://github.com/docker-library/official-images/labels/cve-tracker) - in [docker-library/official-images]( - https://github.com/docker-library/official-images/labels/cve-tracker) - repo) +--- -- Use a container vulnerability scanner -
(e.g. [Docker Security Scanning](https://blog.docker.com/2016/05/docker-security-scanning/)) +## Why does everything take (at least) 100ms? ---- +-- -## Upgrading with Compose +`rng` code: -Compose makes this particularly easy: -```bash -docker-compose build --pull --no-cache -docker-compose up -d -``` +![RNG code screenshot](delay-rng.png) -This will automatically: +-- -- pull base images; -- rebuild all container images; -- bring up the new containers. +`hasher` code: -Remember: Compose will automatically move our -volumes to the new containers, so data is preserved. +![HASHER code screenshot](delay-hasher.png) --- class: title -# Resiliency
and
high availability +But ... + +WHY?!? --- -## What are our single points of failure? - -- The TLS certificates created by Machine are on `node1` +## Why did we sprinkle this sample app with sleeps? -- We have only one Swarm manager +- Deterministic performance +
(regardless of instance speed, CPUs, I/O...) -- If a node (running containers) is down or unreachable, - our application will be affected +-- ---- +- Actual code sleeps all the time anyway -# Distributing Machine credentials +-- -- All the credentials (TLS keys and certs) are on node1 -
(the node on which we ran `docker-machine create`) +- When your code makes a remote API call: -- If we lose node1, we're toast + - it sends a request; -- We need to move (or copy) the credentials somewhere safe + - it sleeps until it gets the response; -- Credentials are regular files, and relatively small + - it processes the response. -- Ah, if only we had a highly available, hierarchic store ... +--- --- +## Why do `rng` and `hasher` behave differently? -- Wait a minute, we have one! +![Equations on a blackboard](equations.png) -- -(That's Consul, if you were wondering) +(Synchronous vs. asynchronous event processing) --- -## Storing files in Consul +# Rolling updates -- We will use [Benjamin Wester's consulfs]( - https://github.com/bwester/consulfs) +- We want to release a new version of the worker -- It mounts a Consul key/value store as a local filesystem +- We will edit the code ... -- Performance will be horrible -
(don't run a database on top of that!) +- ... build the new image ... -- But to store files of a few KB, nobody will notice +- ... push it to the registry ... -- We will copy/link/sync... `~/.docker/machine` to Consul +- ... update our service to use the new image --- -## Installing consulfs - -- Option 1: install Go, git clone, go build ... +## But first... -- Option 2: be lazy and use [jpetazzo/consulfs]( - https://hub.docker.com/r/jpetazzo/consulfs/) +- Restart the workers .exercise[ -- Be lazy and use the Docker image: +- Just scale back to 10 replicas: ```bash - eval $(docker-machine env node1) - docker run --rm -v /usr/local/bin:/target jpetazzo/consulfs + docker service update worker --replicas 10 ``` -] -Note: the `jpetazzo/consulfs` image contains the -`consulfs` binary. +- Check that they're running: + ```bash + docker service tasks worker + ``` -It copies it to `/target` (if `/target` is a volume). +] --- -## Can't we run consulfs in a container? +## Making changes -- Yes we can! +.exercise[ -- The filesystem will be mounted in the container +- Edit `~/orchestration-workshop/dockercoins/worker/worker.py` -- It won't be visible outside of the container (from the host) +- Locate the line that has a `sleep` instruction -- We can use *shared mounts* to propagate mounts from containers to Docker +- Reduce the `sleep` from `0.1` to `0.01` -- But propagating from Docker to the host requires particular systemd flags +- Save your changes and exit -- ... So we'll run it on the host for now +] --- -## Running consulfs - -- The `consulfs` binary takes two arguments: - - - the Consul server address - - a mount point (that has to be created first) +## Building and pushing the new image .exercise[ -- Create a mount point and mount Consul as a local filesystem: +- Build the new image: ```bash - mkdir ~/consul - consulfs localhost:8500 ~/consul + IMAGE=localhost:5000/dockercoins_worker:v0.01 + docker build -t $IMAGE worker + ``` + +- Push it to the registry: + ```bash + docker push $IMAGE ``` ] -Leave this running in the foreground. +Note how the build and push were fast (because caching). --- -## Checking our consulfs mount point - -- All key/values will be visible: +## Watching the deployment process - - Swarm discovery - - - overlay networks - - - ... anything you put in Consul! +- We will need to open a new window for this .exercise[ -- Check that Consul key/values are visible: +- Look at our service status: ```bash - ls -l ~/consul/ + watch -n1 "docker service tasks worker -a | grep -v Shutdown.*Shutdown" ``` ] ---- +- `-a` gives us all tasks, including the one whose current or desired state is `Shutdown` + +- Then we filter out the tasks whose current **and** desired state is `Shutdown` -## Copying our credentials to Consul +- Future versions will have fancy filters to make that less tinkerish -- Use standard UNIX commands +--- + +## Updating to our new image -- Don't try to preserve permissions, though (`consulfs` doesn't store permissions) +- Keep the `watch ...` command running! .exercise[ -- Copy Machine credentials into Consul: +- In the other window, update the service to the new image: ```bash - cp -r ~/.docker/machine/. ~/consul/machine/ + docker service update worker --image $IMAGE ``` ] -(This command can be re-executed to update the copy.) +SwarmKit updates all instances at the same time. + +If only we could do a rolling upgrade! 
--- -## Install consulfs on another node +## Changing the upgrade policy + +- We can set upgrade parallelism (how many instances to update at the same time) -- We will repeat the previous steps to install consulfs +- And upgrade delay (how long to wait between two batches of instances) .exercise[ -- Connect to node2: +- Change the parallelism to 2 and the delay to 5 seconds: ```bash - ssh node2 + docker service update worker --update-parallelism 2 --update-delay 5s ``` -- Install `consulfs`: +- Rollback to the previous image: ```bash - docker run --rm -v /usr/local/bin:/target jpetazzo/consulfs + docker service update worker --image $DOCKER_REGISTRY/dockercoins_worker:v0.1 ``` ] --- -## Mount Consul +## Getting cluster-wide task information -- The procedure is still the same as on the first node +- The Docker API doesn't expose this directly (yet) -.exercise[ +- But the SwarmKit API does -- Create the mount point: - ```bash - mkdir ~/consul - ``` +- Let's see how to use it -- Mount the filesystem: - ```bash - consulfs localhost:8500 ~/consul & - ``` +- We will use `swarmctl` -] +- `swarmctl` is an example program showing how to + interact with the SwarmKit API -At this point, `ls -l ~/consul` should show `docker` and -`machine` directories. +- First, we need to install `swarmctl` --- -## Access the credentials from the other node +## Building `swarmctl` -- We will create a symlink +- I thought I would enjoy a 1-minute break at this point -- We could also copy the credentials +- So we are going to compile SwarmKit (including `swarmctl`) .exercise[ - -- Create the symlink: +- Download, compile, install SwarmKit with this one-liner: ```bash - mkdir -p ~/.docker/ - ln -s ~/consul/machine ~/.docker/ - ``` - -- Check that all nodes are visible: - ```bash - docker-machine ls + docker run -v /usr/local/bin:/go/bin golang \ + go get github.com/docker/swarmkit/... ``` ] ---- - -## A few words on this strategy - -- Anyone accessing Consul can control your Docker cluster -
(to be fair: anyone accessing Consul can wreak
you won't be able to access your credentials +- The Docker Engine places the SwarmKit control socket in a special path -- If Consul becomes unavailable ... -
your cluster will be in a bad state anyway +- And you need root privileges to access it -- You can still access each Docker Engine over the - local UNIX socket -
(and repair Consul that way) +.exercise[ + +- Set an alias so that swarmctl can run as root and use the right control socket: + ```bash + alias \ + swarmctl='sudo swarmctl --socket /var/run/docker/cluster/docker-swarmd.sock' + ``` +] --- -# Highly available Swarm managers +## `swarmctl` in action -- Until now, the Swarm manager was a SPOF -
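For instance, to check (or fix) things on a given node while Consul is unavailable, you can bypass Swarm entirely and talk to the local Engine (a sketch):

```bash
# Talk directly to the Engine on node3, over its local UNIX socket
ssh node3 docker -H unix:///var/run/docker.sock ps
# From there, you can inspect or restart the local Consul container
```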
(Single Point Of Failure) +- Let's review a few useful `swarmctl` commands -- Swarm has support for replication +.exercise[ -- When replication is enabled, you deploy multiple (identical) managers +- List cluster nodes (that's equivalent to `docker node ls`): + ```bash + swarmctl node ls + ``` - - one will be "primary" - - the other(s) will be "secondary" - - this is determined automatically -
(through *leader election*) +- View all tasks across all services: + ```bash + swarmctl task ls + ``` ---- +] -## Swarm leader election +--- -- The leader election mechanism relies on a key/value store -
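Under the hood, a replicated manager is started with something like the command below. This is only a rough sketch of what Machine will set up for us later: the real invocation also configures TLS, and `$NODE_IP` is a placeholder for the node's address.

```bash
docker run -d --restart=always --net host swarm manage \
    -H tcp://0.0.0.0:3376 \
    --replication \
    --advertise $NODE_IP:3376 \
    consul://localhost:8500
```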
(Consul, etcd, Zookeeper) +## Caveat -- There is no requirement on the number of replicas -
(the quorum is achieved through the key/value store) +- SwarmKit is vendored into the Docker Engine -- When the leader (or "primary") is unavailable, -
a new election happens automatically +- If you want to use `swarmctl`, you need the exact version of + SwarmKit that was used in your Docker Engine -- You can issue API requests to any manager: -
if you talk to a secondary, it will forward the request to the primary
see [docker/swarm#1782](https://github.com/docker/swarm/issues/1782).] +``` +Error: grpc: failed to unmarshal the received message proto: wrong wireType = 0 +``` --- -## Swarm replication in practice +# Centralized logging + +- We want to send all our container logs to a central place -- We need to give two extra flags to the Swarm manager: +- If that place could offer a nice web dashboard too, that'd be nice - - `--replication` +-- - *enables replication (duh!)* +- We are going to deploy an ELK stack - - `--advertise ip.ad.dr.ess:port` +- It will accept logs over a syslog socket - *address and port where this Swarm manager is reachable* +- We will deploy a logspout container on every node -- Do you deploy with Docker Machine? -
Then you can use `--swarm-opt` - to automatically pass flags to the Swarm manager +- Logspout will detect containers as they are started, and funnel their logs to logstash --- -## Cleaning up our current Swarm containers +# Setting up ELK to store container logs -- We will use Docker Machine to re-provision Swarm +*Important foreword: this is not an "official" or "recommended" +setup; it is just an example. We do not endorse ELK, logspout, +or the other elements of the stack more than others!* -- We need to: +What we will do: - - remove the nodes from the Machine registry - - remove the Swarm containers +- Spin up an ELK stack with services -.exercise[ +- Gaze at the spiffy Kibana web UI -- Remove the current configuration (remember to go back to node1!): - ```bash - for N in 1 2 3 4 5; do - ssh node$N docker rm -f swarm-agent swarm-agent-master - docker-machine rm -f node$N - done - ``` +- Manually send a few log entries over syslog -] +- Add logspout to send all container output to ELK --- -## Re-deploy with the new configuration +## What's in an ELK stack? + +- ELK is three components: -- This time, all nodes can be deployed identically -
(instead of 1 manager + 4 non-managers) + - ElasticSearch (to store and index log entries) -.exercise[ + - Logstash (to receive log entries from various + sources, process them, and forward them to various + destinations) -```bash - grep node[12345] /etc/hosts | grep -v ^127 | - while read IPADDR NODENAME; do - docker-machine create --driver generic \ - --engine-opt cluster-store=consul://localhost:8500 \ - --engine-opt cluster-advertise=eth0:2376 \ - --swarm --swarm-master \ - --swarm-discovery consul://localhost:8500 \ - --swarm-opt replication --swarm-opt advertise=$IPADDR:3376 \ - --generic-ssh-user docker --generic-ip-address $IPADDR $NODENAME - done -``` + - Kibana (to view/search log entries with a nice UI) -] +- The only component that we will configure is Logstash -.small[ -Note: Consul is still running thanks to the `--restart=always` policy. -Other containers are now stopped, because the engines have been -reconfigured and restarted. -] +- We will accept log entries using the syslog protocol + +- Log entries will be stored in ElasticSearch, +
and displayed on Logstash's stdout for debugging --- -## Assess our new cluster health +## Setting up ELK -- The output of `docker info` will tell us the status - of the node that we are talking to (primary or replica) +- We need three containers: ElasticSearch, Logstash, Kibana -- If we talk to a replica, it will tell us who is the primary +- We will place them on a common network, `logging` .exercise[ -- Talk to a random node, and ask its view of the cluster: +- Create the network: ```bash - eval $(docker-machine env node3 --swarm) - docker info | grep -e ^Name -e ^Role -e ^Primary + docker network create --driver overlay logging ``` -] +- Create the ElasticSearch service: + ```bash + docker service create --network logging --name elasticsearch elasticsearch + ``` -Note: `docker info` is one of the only commands that will -work even when there is no elected primary. This helps -debugging. +] --- -## Test Swarm manager failover +## Setting up Kibana -- The previous command told us which node was the primary manager +- Kibana exposes the web UI - - if `Role` is `primary`, -
then the primary is indicated by `Name` +- Its default port (5601) needs to be published - - if `Role` is `replica`, -
then the primary is indicated by `Primary` +- It needs a tiny bit of configuration: the address of the ElasticSearch service + +- We don't want Kibana logs to show up in Kibana (it would create clutter) +
so we tell Logspout to ignore them .exercise[ -- Kill the primary manager: +- Create the Kibana service: ```bash - ssh node`N` docker kill swarm-agent-master + docker service create --network logging --name kibana --publish 5601:5601 \ + -e LOGSPOUT=ignore -e ELASTICSEARCH_URL=http://elasticsearch:9200 kibana ``` ] -Look at the output of `docker info` every few seconds. - --- -# Highly available containers +## Setting up Logstash -- Swarm has support for *rescheduling* on node failure +- Logstash needs some configuration to listen to syslog messages and send them to elasticsearch -- It has to be explicitly enabled on a per-container basis +- We could author a custom image bundling this configuration -- When the primary manager detects that a node goes down, -
those containers are rescheduled elsewhere +- We can also pass the configuration on the command line -- If the containers can't be rescheduled (constraints issue), -
they are lost (there is no reconciliation loop yet) - -- In Swarm 1.1, this is an *experimental* feature -
(To enable it, you must pass the `--experimental` flag when you start Swarm itself!) - -- In Swarm 1.2, you don't need the `--experimental` flag anymore - ---- - -## About Swarm generic flags - -- Some flags like `--experimental` and `--debug` must be *before* the Swarm command -
(i.e. `docker run swarm --debug manage ...`) - -- We cannot use Docker Machine to pass that flag ☹ -
(Machine adds flags *after* the Swarm command) +.exercise[ -- Instead, we can use a custom Swarm image: - ```dockerfile - FROM swarm - ENTRYPOINT ["/swarm", "--debug"] +- Create the Logstash service: + ```bash + docker service create --network logging --name logstash \ + -e LOGSPOUT=ignore logstash -e "$(cat logstash.conf)" ``` -- We can tell Machine to use this with `--swarm-image` +] --- -## Start a resilient container - -- By default, containers will not be restarted when their node goes down - -- You must pass an explicit *rescheduling policy* to make that happen +## Checking Logstash -- For now, the only policy is "on-node-failure" +- Before proceeding, let's make sure that Logstash started properly .exercise[ -- Start a container with a rescheduling policy: +- Lookup the node running the Logstash container: + ```bash + docker service tasks logstash + ``` +- Log into that node: ```bash - docker run --name highlander -d -e reschedule:on-node-failure nginx + ssh ip-172-31-XXX-XXX ``` ] -Check that the container is up and running. - --- -## Simulate a node failure - -- We will reboot the node running this container - -- Swarm will reschedule it +## View Logstash logs .exercise[ -- Check on which node the container is running: -
`NODE=$(docker inspect --format '{{.Node.Name}}' highlander)` +- Get the ID of the Logstash container: + ```bash + CID=$(docker ps -q --filter label=com.docker.swarm.service=logstash) + ``` -- Reboot that node: -
`ssh $NODE sudo reboot` +- View the logs: + ```bash + docker logs --follow $CID + ``` -- Check that the container has been recheduled: -
`docker ps -a` +] +You should see the heartbeat messages: +.small[ +```json +{ "message" => "ok", + "host" => "1a4cfb063d13", + "@version" => "1", + "@timestamp" => "2016-06-19T00:45:45.273Z" +} +``` ] --- -## Reboots - -- When rebooting a node, Docker is stopped cleanly, and containers are stopped - -- Our container is rescheduled, but not started - -- To simulate a "proper" failure, we can use the Chaos Monkey script instead +## Testing the syslog receiver -```bash -~/orchestration-workshop/bin/chaosmonkey $NODE -``` +- In a new window, we will generate a syslog message ---- +- We will use the `logger` standard utility -## Cluster reconciliation +- We will run it in a service connected to the `logging` network -- After the cluster rejoins, we can end up with duplicate containers +- We don't want it to be restarted forever, so we will do that in a one-shot container .exercise[ -- Once the node is back, remove one of the extraneous containers: +- Send a test message: ```bash - docker rm -f node`N`/highlander + docker service create --network logging --restart-condition none debian \ + logger -n logstash -P 51415 hello world ``` ] --- -## .warning[Caveats] - -- There are some corner cases when the node is also - the Swarm leader or the Consul leader; this is being improved - right now! - -- The safest way to address for now this is to run the Consul - servers, the Swarm managers, and your containers, on - different nodes. - -- Swarm doesn't handle gracefully the fact that after the - reboot, you have *two* containers named `highlander`, - and attempts to manipulate the container with its name - will not work. This will be improved too. - ---- - -class: title - -# Conclusions - ---- - -## Swarm cluster deployment - -- We saw how to use Machine with the `generic` driver to turn - any set of machines into a Swarm cluster +## Connect to Kibana -- This can trivially be adapted to provision cloud instances - on the fly (using "normal" drivers of Docker Machine) +- The Kibana web UI is exposed on cluster port 5601 -- For auto-scaling, you can use e.g.: +- Open the UI in your browser: http://instance-address:5601/ - - private admin-only network - - - no TLS - - - static discovery on a /24 to /20 network (depending on your needs) + (Remember: you can use any instance address!) --- -## Key/value store - -- We saw an easy deployment method for Consul - -- This is good for 3 to 9 nodes +## "Configuring" Kibana -- Remember: raft write performance *degrades* as you add nodes! +- If you see a status page with a yellow item, wait a minute and reload + (Kibana is probably still initializing) -- For bigger clusters: +- Kibana should offer you to "Configure an index pattern": +
in the "Time-field name" drop down, select "@timestamp", and hit the + "Create" button - - have e.g. 5 "static" server nodes +- Then: - - put them in round robin DNS record set (or behind an ELB) + - click "Discover" (in the top-left corner) + - click "Last 15 minutes" (in the top-right corner) + - click "Last 1 hour" (in the list in the middle) + - click "Auto-refresh" (top-right corner) + - click "5 seconds" (top-left of the list) - - run a normal agent on the other nodes +- You should see a series of green bars (with one new green bar every minute) --- -## App deployment +## Visualizing container logs in Kibana -- We saw how to transform a Compose file into a series of build artifacts + - - using S3 or another object store is trivial +- We can customize the web UI to be more readable -- We saw how to programmatically add load balancing, logging +.exercise[ -- This can be improved further by using variable interpolation for the image tags +- In the left column, move the mouse over the following + columns, and click the "Add" button that appears: -- Rolling deploys are relatively straightforward, but: + - - I recommend to aim directly for blue/green (or canary) deploy + - logsource + - program + - message - - In the production stack, abstract stateful services with ambassadors +] --- -## Operations - -- We saw how to setup an ELK stack and send logs to it in a record time +## Setting up Logspout - *Important: this doesn't mean that operating ELK suddenly became an easy thing!* +- Logspout connects to the Docker control socket -- We saw how to translate a few basic tasks to containerized environments +- Using the Docker events API, it automatically detects new containers - (Backups, network traffic analysis) +- Using the Docker logging API, it streams logs of all containers to its outputs -- Debugging is surprisingly similar to what it used to be: +- We will run a logspout container on each node (using global scheduling), and bind-mount the Docker control socket into the logspout container - - remember that containerized processes are normal processes running on the host +.exercise[ - - `docker exec` is your friend +- Create the logspout service: + ```bash + docker service create --network logging --name logspout --mode global \ + --mount source=/var/run/docker.sock,type=bind,target=/var/run/docker.sock \ + -e SYSLOG_FORMAT=rfc3164 gliderlabs/logspout syslog://logstash:51415 + ``` - - also: `docker run --net host --pid host -v /:/hostfs alpine chroot /hostfs` +] --- -## Things we haven't covered +## Viewing container logs -- Per-container system metrics (look at cAdvisor, Snap, Prometheus...) +- Go back to Kibana -- Application metrics (continue to use whatever you were using before) - -- Supervision (whatever you were using before still works exactly the same way) - -- Tracking access to credentials and sensitive information (see Vault, Keywhiz...) - -- ... (tell me what I should cover in future workshops!) ... +- Container logs should be showing up! 
--- -## Resilience - -- We saw how to store important data (crendentials) in Consul +## Controlling Docker from a container -- We saw how to achieve H/A for Swarm itself - -- Rescheduling policies give us basic H/A for containers +- In a local environment, just bind-mount the Docker control socket: + ```bash + docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker + ``` -- This will be improved in future releases +- Otherwise, you have to: -- Docker in general, and Swarm in particular, move *fast* + - set `DOCKER_HOST`, + - set `DOCKER_TLS_VERIFY` and `DOCKER_CERT_PATH` (if you use TLS), + - copy certificates to the container that will need API access. -- Current high availability features are not Chaos-Monkey proof (yet) +More resources on this topic: -- We (well, the Swarm team) is working to change that +- [Do not use Docker-in-Docker for CI]( + http://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/) +- [One container to rule them all]( + http://jpetazzo.github.io/2016/04/03/one-container-to-rule-them-all/) --- -## What's next? - -- November 2015: Compose 1.5 + Engine 1.9 = -
first release with multi-host networking - -- January 2016: Compose 1.6 + Engine 1.10 = -
embedded DNS server, experimental high availability +## Bind-mounting the Docker control socket -- April 2016: Compose 1.7 + Engine 1.11 = -
round-robin DNS records, huge improvements in HA
(there is a tag for each big round of updates) --- -## Overall complexity - -- The scripts used here are pretty simple (each is less than 100 LOCs) - -- You can easily rewrite them in your favorite language, -
adapt and customize them in a few hours
(manage your Docker nodes from a SaaS portal)
(buzzword-compliant management solution: -
turnkey, enterprise-class, on-premise, etc.) +- Bundles ---