Written by royalmo (Eric Roy).
This lab session is part of the ICT Systems Engineering program at EPSEM (UPC Manresa). You can check more content here.
- Introduction
- Docker installation
- What is Docker?
- What is docker-compose?
- Docker networking
- A complete exercise
- More information
This lab session will introduce you to the Docker environment, so that by the end of it you will be able to configure the network of some Docker containers as you wish, as well as understand basic Docker concepts.
You don't need to know anything about Docker, nor read any documentation, as everything needed is provided here (and the work you did in the previous lab sessions). However, to achieve a complete understanding, it is recommended to extend your knowledge by reading some documentation and understanding the source code that we will be using.
You need to deliver a PDF document containing an explanation of all the tasks of this document. Note that you don't need to document the exercises, as they're normally steps from the tutorial. Nevertheless, if you don't complete them, you won't be able to follow this tutorial and the tasks.
Docker can be installed on any machine, regardless of its OS. However, it's recommended to use a GNU/Linux based OS, as this lab session has not been fully tested on other systems.
You can follow the updated installation tutorial at https://docs.docker.com/get-docker/. Follow the steps to install Docker Desktop on your machine. If you use GNU/Linux, you simply need to install it like any other application:
$ sudo apt-get install docker docker-compose
If you installed it via apt, you may not have the Desktop interface, but in this tutorial we will only use the CLI (Command-Line Interface).
EXERCISE 1
Install docker and docker-compose in your system. Note that docker-compose is included in the Docker Desktop application.
Once installed, and before you continue with this lab session, check that you have everything in place. Here is what you should look for (note that Docker needs both the client and the server packages to work correctly).
$ docker version
Client: [...]
 Version: 20.10.22
 [...]
Server: [...]
 Engine:
  Version: 20.10.22
  [...]
$ docker-compose version
docker-compose version 1.17.1, build unknown
[...]
Versions may change as time goes by, but as long as the CLI APIs don't change, this tutorial should work correctly.
EXERCISE 2
Check that your Docker installation was successful.
Docker Hub is where Docker images are stored (as public or private images). In this lab session we will need to pull some public images from there. You shouldn't need to create an account in order to pull them, but Docker policies could vary in the future.
Creating a Docker Hub account is the first step for getting more into Docker. It's free, and will let you access more features (that aren't needed for this lab session).
EXERCISE 3 (Optional)
Create a Docker Hub profile and log in from your CLI using `docker login -u <username>`.
Once everything is installed, you may have encountered a problem: every docker and docker-compose command requires superuser permissions (i.e. `sudo`). Now, you can choose between 3 alternatives:
- You can still use `sudo` in every command. It isn't recommended, as it will be painful, but if you're lazy or in a hurry it's still possible.
- You can run a root terminal with `sudo -i`; there, docker commands won't require any privilege escalation.
- Last but best, you can add your user to the docker group. Once this is done, you won't need to run anything with sudo anymore. There's an official tutorial on how to achieve this in this link.
It's highly recommended to follow the tutorial: you will only need to run 3 commands (sketched below) and everything will be much easier.
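For reference, the commands in the official tutorial boil down to the following; double-check them against the linked page before running, in case they have changed:
$ sudo groupadd docker            # create the docker group (it may already exist)
$ sudo usermod -aG docker $USER   # add your user to that group
$ newgrp docker                   # reload group membership without logging out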
EXERCISE 4 (Optional)
Set up your docker environment so you don't need to run every command with `sudo`.
If you followed the tutorial but get an error about some `docker.sock` file, run `sudo chmod 666 /var/run/docker.sock` (+ info).
The best way to define what Docker is, is to explain why it exists.
Let's suppose the case of a computer that needs to run 5 different applications.
Each application needs its dependencies, and there may be incompatibilities between the versions required by each app. One solution, using only one physical machine, is to create 5 virtual machines and run one application in each machine. That way, they run in an isolated world, where they have their own dependencies and (maybe) a different kernel and OS.
However, a virtual machine is hard to maintain: we would need to treat it like a personal computer, keeping the whole OS updated and putting up with long boot and shutdown times. But above all, the resources of the physical machine would need to be split into fixed shares, meaning that at some point application 1 wouldn't need all of its resources while application 2 couldn't keep up with the resources it was given.
Here's where Docker comes to our rescue. There are a lot of features that an application doesn't need in order to run (e.g. a desktop environment). It is possible to create a "virtual machine" with just the necessary libraries for the desired application. This reduces resource usage as well as boot and shutdown times.
But that's not all: instead of reserving physical resources, let's run each application as a process of the host's kernel. That way, resources can be shared with the host machine. And this is exactly what Docker does.
There are some words that are constantly repeated throughout the Docker documentation and this document. Let's define them in a few words:
- Host machine: The physical computer that will be running the containers.
- Dockerfile: The file that describes an image. Using an analogy, we could say that the Dockerfile is the Makefile of a project and the image is the executable file. It contains the instructions needed to recreate the image. You have an example of a `Dockerfile` in the repository of the lab session.
- Image: As said before, this can be compared to an executable file. When an image is run, a container is created.
- Container: A container is a kernel process that runs an image. As you can imagine, an image can be executed multiple times at once. The changes made in a container will not affect the image. This means that you can mess around with a container while other containers and the image remain intact, and once you stop that container you will lose all your changes.
Here is a list of the most basic commands that you may need in the first tasks:
- `docker build -t <image_name> <Dockerfile_folder>` will run the `Dockerfile` that's in the given folder, and save the image with the given name.
- `docker run [-it] <image_name>` will create a container from the given image. If the image runs an interactive application (e.g. bash), you will need to specify it with the `-it` flags. Docker implemented a feature so that, if the specified image doesn't exist on the host machine, it will look for it in the Docker Hub repositories, download it and run it. This will save us (in our case) every `docker build` and `docker pull` command.
- `docker ps` lists the active containers and some information about them.
- `docker image ls` will list all locally saved images. You can remove them to save some disk space.
- `docker system prune -a` will delete everything you created using docker commands: networks, images, rules... This can be a good idea if you don't know what you just did. There are less drastic ways to fix things, but this one will work every time.
There are a ton of commands, and some of them will be explained throughout this document, but if you wish to have a cheat sheet you can check the official documentation.
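To give you an idea of how these commands fit together, a first session could look like this (the image name is just a placeholder, and the build step assumes there is a Dockerfile in the current folder):
$ docker build -t my_test_image .   # build the Dockerfile in the current folder
$ docker run -it my_test_image      # start an interactive container from that image
$ docker ps                         # in another terminal: list the running containers
$ docker image ls                   # list the locally saved images
$ docker system prune -a            # wipe everything once you're done experimenting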
Now that you have some knowledge about Docker, let's play a little bit with it.
EXERCISE 5
Run `docker run -it ubuntu`. The container's bash prompt will be displayed over the current terminal. In another terminal, run `docker ps` to check the state of that container.
As you can see, the `ubuntu` image (`_/ubuntu`) contains a basic Ubuntu distribution (the version can be specified, e.g. `docker run -it ubuntu:18.04`) with only the most basic commands. Having only the essential programs keeps the image lightweight and boot times fast.
You can see that we are in another machine because the terminal prompt shows a different host name (remember: `user_name@computer_name`).
TASK 1
In that container, check which of these commands work properly:
emacs test.txt
echo hola
nano test.txt
vi test.txt
ping google.com
ip a
sudo echo hola
Did any of the results surprise you? Will you be able to perform your tasks with these applications?
As you may imagine, we need to install some packages to run our applications. For example, if we want to containerize a Python script, we will need to install Python first.
EXERCISE 6
Run `apt list` and check if the `nano` package is listed. Remember that this list contains all the installable and installed packages.
Hint: use `grep` to find the nano package faster.
Hint 2: using a pipeline you may get a warning. It shouldn't affect anything, but you can hide it by suppressing apt's stderr output (`2> /dev/null`).
To compare, you can check if the `apt` package appears in that list.
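One way to combine both hints is the pipeline below (just a sketch; the exact output depends on the image's package lists):
$ apt list 2> /dev/null | grep nano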
The basic Ubuntu image comes with the smallest package list possible, again, to use fewer resources. But in order to install a package, its entry needs to be in the package list. Do you remember how we update the package list?
TASK 2
Try to install nano (`apt install nano`). Did it work? Update the package list with `apt update`. Once this is done, check `apt list` again. Do you see it now?
Install it and try the `nano` command. Did it work?
Now we will verify the persistence of a container.
TASK 3
Stop the current container by exiting its bash terminal. Check that nothing appears in `docker ps`, but that the Ubuntu image still appears in `docker image ls`.
Run the container again, and check if `nano` is still installed. Did you expect that to happen? Why?
It would be painful to install everything every time we restart our container. That's why we can create Dockerfiles that build an Ubuntu (or whatever) image with added applications. In fact, `ubuntu` is itself an image that sits on top of Docker's kernel (a special Linux kernel).
To make your life easier, an image has already been prepared for you. It's called `royalmo/docker-networks`. You can find its Dockerfile in this repository. It's recommended to read it in order to understand what the image contains, but it isn't mandatory.
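To give you an idea of what such a Dockerfile looks like, here is a minimal sketch. It is not the actual Dockerfile of the image (check the repository for that one); it just shows how extra tools get baked into an Ubuntu-based image:
FROM ubuntu:22.04
# Pre-install the tools we want available in every container
RUN apt-get update && apt-get install -y nano iproute2 iputils-ping
# Keep bash as the main process so we can attach to it later
CMD ["bash"]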
TASK 4
Run the `royalmo/docker-networks` image, and check which of the commands that didn't work in the first task work now. Do you have all the tools needed to configure the network?
Docker-compose is an extension to the docker environment that makes it easier to manage multiple containers at once, and the connections between them. In our case, we will use it to create multiple containers at once, and to connect them as we wish.
The `docker-compose` command will only work if it's executed in a folder where a file named `docker-compose.yml` exists. We will not talk too much about what this file needs to contain, but rather about how to use it. Here is a continuation of the previous command list.
- `docker-compose up [-d]` will build the images and run all containers. Use `-d` if some containers are interactive: this way everything runs in the background and you will be able to attach to the terminals later.
- `docker-compose down` will stop and delete all containers.
- `docker-compose ps` will show the state of the containers created. Please note that this command does a different thing than `docker ps`: docker-compose displays the services of the current `docker-compose.yml` (if they're up), while docker shows only the running containers.
- `docker attach <container_name>` will attach to the terminal, if possible, of the given container. `<container_name>` must appear in `docker ps`.
- `docker exec <container_name> <command>` will execute a command in the given container. `<container_name>` must appear in `docker ps`. Although not used in this document, it's important to understand the difference with `docker attach`.
- `docker-compose restart <service_name>` will restart a single service that may appear as stopped in `docker-compose ps`. A service can be stopped due to an error, the user exiting its shell, or `docker stop`.
- If you get an image not found error when you run `docker-compose up`, it is probably because you didn't run `docker-compose down` before. A docker-compose file cannot be brought up multiple times at once. If nothing is displayed in `docker-compose ps`, you're good to run `up` again.
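A typical round trip with these commands, run inside the folder that contains the docker-compose.yml, could look like this (the container name is only an example; use whatever `docker ps` shows on your machine):
$ docker-compose up -d        # build and start all services in the background
$ docker-compose ps           # check that the services are up
$ docker ps                   # find the real container names
$ docker attach test_node1_1  # attach to one of them
$ docker-compose down         # stop and delete everything when you're done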
In this document we will always use `docker attach`, but let's introduce ourselves to the typical Docker scenario.
A Docker container will (almost) always be an application doing something. These types of applications will normally be servers: web servers, databases, ... something that will return a response after an input (normally sent via network) is received.
The main application of those containers will be the daemon of that server (a process that runs indefinitely). In this case, we can clearly see the difference between `docker attach` and `docker exec`:
- Imagine we need to access the server console (e.g. a Minecraft server) to ban a player. With `docker attach`, we attach to the Java executable and communicate with the server. In other words, with `docker attach` we communicate with the main process. Note that not all main processes are interactive, so `docker attach` may not work in every container.
- However, we may need to modify a backup script on that server. For this purpose, we could run `docker exec -it <container_name> bash` and we would land in a bash terminal inside that container, as if it were SSH. There we can create, delete and modify all the files we need. In other words, with `docker exec` we communicate with the container, not with any particular process.
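Sketched with a hypothetical container called mc_server, the two situations would look like this:
$ docker attach mc_server         # talk to the main process (the server console)
$ docker exec -it mc_server bash  # open a new bash shell inside the same container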
In this document, we don't have a server in any container, as we don't need one. That's why we set bash as the main process, so we can `attach` to it instead of `exec`-ing a new bash process. But keep in mind that this isn't the general rule nor how Docker is meant to be used.
To sum up this section, let's practice a little bit. We will work with a basic (but not minimal) `docker-compose.yml` that will contain this text:
version: "3.5"
services:
  node1:
    image: royalmo/docker-networks
    hostname: node1
    tty: true
    stdin_open: true
  node2:
    image: ubuntu:22.04
    hostname: node2
    tty: true
    stdin_open: true
EXERCISE 7
Create a folder `test` and, inside of it, a file `docker-compose.yml` with the text provided above.
This is all the setup we need to do. Now, let's play with it!
TASK 5
Run the `docker-compose` setup in the background (extra: what happens when you don't add the `-d` flag?). Check the status and the container names with `docker-compose ps`. What are their states and container names?
What would be the command to attach to one of their terminals?
Once the containers are up, we can check that everything works fine.
EXERCISE 8
Open another terminal with 2 tabs (in most terminal emulators you can create a tab with Ctrl+Shift+T) and attach one container per tab. You can check that the bash prompt is different for each container.
Create a different file in each container with `touch`.
The last task of this section will be to restart a stopped container.
TASK 6
Stop a container by exiting its shell. Does it appear now in `docker ps`? And in `docker-compose ps`? Do we get any further info?
Restart only that container, and attach to it again. Does the file still exist?
What would happen if, instead of restarting, you stopped all containers with `docker-compose down` and started them again? You can verify your answer by trying it.
Docker has a very complex but easy-to-understand network system that allows containers to communicate with the outside world. When installed, it creates its own isolated environment.
The main and default Docker network is called `bridge` and, as its name tells, it acts like a bridge between the containers and the host. New containers are assigned an address from that network's range. This means that, if we enable a NAT router on our host machine, they will have Internet access.
TASK 7
On your host machine, run `ip a` and `ip route`. You will see that Docker added some devices during its installation. What IP ranges do they use? Does your host machine know how to reach them (i.e. do they have their own route entries)?
Now check the current POSTROUTING table (`sudo iptables -vL -t nat`, and look at the POSTROUTING section). Do you see something that could be acting as a NAT router for the Docker containers?
The bridge network type is only one of the five types that Docker has. Here you have a small description of every type:
- BRIDGE: This is the most common network type, and the default for every new container. Every new network is simply a new sub-range of IP addresses inside the 172.16.0.0/12 private range that Docker uses by default. Of course, the addresses can also be set manually. The bridge network also connects the containers with the host machine, as the host has an IP on it too (which is normally the gateway).
- HOST: The host network simply runs the container with the network of the host. This means that it has the same IPs and routes. It's as if we were running the application on the host machine, but with the libraries and packages isolated.
- NONE: This type of Docker network specifies that a container must not have any network attached to it. This can become very useful to protect our applications, so they don't have any external connections. We haven't mentioned them in this document, but Docker provides other inter-container communication methods, so we would still be able to control the application.
- OVERLAY: This last network type is a little bit special. An overlay network is a "virtual network" that is independent of the physical devices. This means that multiple devices can, together, host a single overlay network. This method becomes very useful when using distributed servers, and can be compared to Kubernetes. By default, these networks are allocated in the 10.0.0.0/8 IP range. We will just use this network type for educational purposes in a non-standard way, as we will have multiple overlay networks on a single physical computer. To be able to create overlay networks in this edge case, we need to initialize a special Docker feature: swarm. It's explained in the next section.
- MACVLAN: The macvlan network creates a new connection to the host computer's main network. As that sentence isn't easy to grasp, let's see an example (but don't worry, we won't use this type in the exercises and tasks). Imagine your laptop is connected to the Internet through WiFi (or Ethernet) with the IPv4 address 192.168.1.5. If a new container is attached to a macvlan network, Docker will "try to connect a new device to your router", so that, for example, the new container's IPv4 address is 192.168.1.7. You can set it up to use DHCP or a manual configuration.
EXERCISE 9
Run `docker network ls` to see the default docker networks and their names. When a new network is created, you will see it here.
Note: From now on, you'll be creating (directly or indirectly) a lot of Docker networks, so the output of `docker network ls` can become a nightmare. You can delete all unused networks (except for the default ones) with `docker network prune`.
To use overlay networks, we need to have swarm. It's the Kubernetes of Docker: it manages the containers between the different hosts.
For example, if we make a cluster of 2 hosts and 5 containers need to be executed between those hosts, swarm will decide where to run each container, depending on the resources needed, to make both hosts have approximately the same amount of work.
In our case, we only have one host, but we still need to initialize swarm, as overlay networks work on top of it. As you can imagine, the command to do so is `docker swarm init` (don't run it yet). You need to do this only once: if you already did it, you will get an error saying that this node is already part of a swarm.
Warning: docker swarm is stricter than your computer: you may have more than one IPv4 or IPv6 address on a network interface, which is fine for normal Ubuntu usage. However, swarm requires a single address (you can have one IPv4 and one IPv6, but not two IPv4 addresses), so you may get an error if you have more than one. You can tell swarm which address to use (see the example below), but the easiest fix is to delete the extra one.
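If you prefer to keep both addresses, swarm lets you choose the one to advertise; the address below is just a placeholder, replace it with one of your own:
$ docker swarm init --advertise-addr 192.168.1.5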
If you're interested in practicing a little bit with swarm, you can check out this tutorial.
As in a normal case we don't need any complex networking, Docker does a lot of things for us by default. For example, when a network is created, an IP range is assigned, and when containers are added to it, an IP is automatically assigned to each one. Imagine how painful it would be to manually connect each new container to the network (with a different IP) every time you start one!
Docker provides a feature to inspect a network. This can help us when debugging, and will show us relevant information. The magic command is `docker network inspect <network_name>`. This will result in a JSON object being printed in our terminal (we could also save it to a file for easier reading). It will contain the IP range, the connected containers and their IPs, and other metadata.
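For example, to inspect the default bridge network and keep the output around for later reading (the file name is arbitrary):
$ docker network inspect bridge > bridge_info.json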
TASK 8
With the compose file from the previous section running, inspect all the networks available and note down each container's IP address. Verify your answers by running `ip a` in every container.
Docker also sets up the routes to all neighbors at the start of every new container. This means that, if two containers are on the same network, they will be able to ping each other without prior configuration.
But that's not all, Docker also provides a "DNS server", so you can put the container name instead of its IP address, and it will replace it. However, this feature may not work if we change some network settings on-the-go (and surprise, this is what we will do!).
TASK 9
Using the `ping` command (you can use a container's name instead of its IPv4 address), perform these tasks:
Check with `ip route` whether the containers can communicate with each other.
Can they also communicate with the host machine? If so, which IP must they use to reach the host machine?
Can all containers reach the Internet (e.g. google.com)? Did you expect that?
Now set the host's ip_forward flag to 0 (Docker sets it to 1 by default) with `sudo sysctl -w net.ipv4.ip_forward=0`. Can the containers still ping google? Did you expect that to happen?
Remember to set the ip_forward bit back to 1 at the end of this task.
If you look at some Docker tutorials, you may see that the only network-related thing they do is expose a port. Now that we know how Docker connects each container, what does this actually do? The answer is really simple: it adds a rule to the host's iptables.
For example, "forward any inbound packets arriving on the WiFi interface on port 80 to the Docker container with IP X on port 3000". As you can imagine, we could do this ourselves (a hand-written rule of this kind is sketched below), but Docker does it for us.
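Written by hand, such a rule could look roughly like this; the interface name, container IP and ports are made-up values for illustration only:
$ sudo iptables -t nat -A PREROUTING -i wlan0 -p tcp --dport 80 -j DNAT --to-destination 172.17.0.2:3000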
EXERCISE 10
Check the NAT table in iptables (same command as before). Save the output.
Now run `docker run -it -p 100:3000 ubuntu`, and in another terminal check the iptables again.
Compare both outputs. Do you see a new port-forwarding rule somewhere? Can you guess what the `-p` flag does, without reading the documentation?
Extra: add the conclusions of this exercise to the report.
A good way to learn all the network stuff is to play with it. For this, you will need some new docker commands. Here are some examples that you may need:
- `docker network create -d bridge --subnet 172.0.56.0/24 new_network_name` will create a new network of type bridge with the specified subnet and name.
- `docker run -it --network new_network_name royalmo/docker-networks` will run an image in a container attached to the specified network.
As you can see, Docker asks us to specify the subnet of a network. With a physical Switch, this setup doesn't need to be done, but as we are working in a virtualized environment, we need to add this extra information. It's done this way so Docker can assign IPs automatically, and to add an extra security barrier.
If you need to create a network with more options, check out this reference manual.
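Putting both commands together, a quick experiment could look like this (the network and container names are arbitrary; on user-defined bridge networks, containers can usually reach each other by name):
$ docker network create -d bridge --subnet 172.0.56.0/24 lab_net
$ docker run -it --network lab_net --name node_a royalmo/docker-networks
$ docker run -it --network lab_net --name node_b royalmo/docker-networks   # in a second terminal
# From node_a's shell, try: ping node_b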
TASK 10
Imagine that we have a physical computer (A) and another physical computer (B), both connected to the same LAN (i.e. (A) and (B) can ping each other).
- There is a bridge network in (A) with subnet 10.250.45.0/24.
- (A) has a single container (with image `royalmo/docker-networks`) connected to that bridge network.
- (B) has the route 10.250.45.0/24 via <A's IP>.
Try it out (if possible) and justify the answers to these questions:
- Will (A)'s container be able to ping (B)?
- Will (B) be able to ping (A)'s container?
- Note that (A) implements a NAT router. Does it help protect (A)'s containers from external attacks?
Now that you know the essentials about docker, docker-compose and docker networking, it's time to apply them to simulate a real-life network.
A `docker-compose.yml` file has been prepared just for you. You don't need to understand what's inside of it, but now that you know a lot about Docker it could be an interesting exercise. You can see the file in this lab session's repository or in this direct link.
From now on, we will call each container of our compose file a node. Each node simulates a device on our virtual network.
Disclaimer: this compose file uses overlay networks, which require swarm to be set up. You may find this a little bit overkill, and it is.
If you'd rather not use swarm, there's a workaround just for you: you can replace all overlay networks with bridges. Everything will work the same way, except that the host will act as a network superuser: it will be able to connect to every node, and every node will have Internet access. However, some nodes won't be able to communicate with each other right away.
You will then need to adapt the statements of the following tasks a little. You won't need to connect each node to the Internet, but you can, for example, route all default traffic through another interface and check that everything works as expected using tcpdump.
EXERCISE 11
Download the docker-compose file into a new folder called `docker_networks` and run it.
This file needs to create overlay networks, so remember to initialize swarm the first time (and only the first time) with `docker swarm init`. Some warnings about swarm may appear while everything starts, telling you that there are no other hosts. That would be a problem in a normal setup, but it's fine for us.
Once everything is up and running, attach to every node. Remember to use your terminal's tabs or another fancy terminal for Ubuntu like tilix.
The nodes are connected with overlay networks. To prevent some headaches, a schema is provided to you:
In the repository you will find a `.gv` file with the source code of this schema. You will find it useful for the next task.
TASK 11
Fill in the graph with the subnets, the containers' IPs and the container interfaces' names. That way, you will have all the names and IPs you need in a single place.
Warning! Every time you restart the docker-compose, all addresses, ranges and configurations done on the nodes will be lost. It's recommended to do tasks 11, 12 and 13 in one go, so you can reuse the results from the previous tasks. Pick a day you have plenty of time!
Add the updated graph to the report.
Now that we are familiar with our node network, let's check its initial state.
EXERCISE 12
Check if node1 has Internet access (with a ping to google.com). Now check it for every other node. Try also to ping node1 from node2, and node7 from node2.
As you can see, there is a lot to do. But since we already did this kind of work in previous lab sessions, it should go faster than expected.
However, you can see that there are multiple ways to reach some nodes. Be careful with how you set up the routes: you may end up sending packets into an infinite loop (and losing them because of the TTL)!
We need to take another thing into account: Docker created a NAT router only for the bridge network's subnet. We need to add the overlay networks' IPs to that NAT. As the following tasks are already quite long, here you have the magic command:
sudo iptables -t nat -A POSTROUTING ! -o <net0> -s 10.0.0.0/8 -j MASQUERADE
Replace `<net0>` in the last command with the name of the net0 bridge interface that you can see when running `ip a` on the host computer (e.g. `br-c39dadf0293e`).
You will have to delete the rule manually at the end of the lab session (the deletion command is sketched below).
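Assuming you added the rule exactly as above, it can be removed by repeating the same rule with -D (delete) instead of -A (append):
$ sudo iptables -t nat -D POSTROUTING ! -o <net0> -s 10.0.0.0/8 -j MASQUERADE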
TASK 12
By only changing the packet-forwarding bit and the routing tables of every node (and maybe the host?), make every node able to communicate with every other node and have Internet access (remember to add the POSTROUTING rule on the host).
Explain the criteria you used to set up the routes (i.e. why you chose to send packets one way and not another).
You can check the results by repeating Exercise 12. Also run tcpdump on some nodes to check that the packets are going through the desired route.
If you finished the last task and all worked, congratulations! You finished this lab session and learned something (hopefully).
For those who haven't had enough, an extra task has been prepared for you. As with Task 12, it's something you already did in other lab sessions. Have fun!
TASK 13 (Optional)
Set up node2 as a NAT router. Re-evaluate the routes (now some nodes won't be able to reach other nodes directly).
Verify that everything works as expected by looking at the ports and IP addresses in tcpdump's output on the relevant nodes.
If you wish to understand the files and commands used in this document, you can check the official documentation, or look for some tutorials.
As you can see, all of these settings can be done in a Dockerfile. This means that we could create a docker-compose file with the correct images, such that when we run `docker-compose up`, it starts all the nodes and sets up the routes automatically.
But that is too much for a lab session; do it on your own, and only if you want! ;)
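If you're curious, a very rough sketch of the idea for one node follows. The addresses are invented, and the container needs the NET_ADMIN capability to edit its own routing table; treat it as a starting point, not a tested recipe:
version: "3.5"
services:
  node1:
    image: royalmo/docker-networks
    cap_add:
      - NET_ADMIN
    # Add a static route at startup, then hand control to bash
    command: sh -c "ip route add 10.0.2.0/24 via 10.0.1.1 && exec bash"
    tty: true
    stdin_open: true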
There's a cool GitHub repository I haven't tried, but it looks promising. It's LeoVerto/docker-network-graph. It creates a Docker network graph automatically, so it may be worth giving it a try!
This project is under the GPL v3 license. Please give credit if you wish to use it somewhere else. Feel free to ask me before getting in trouble with the law!