# Working with Containers

![Status](https://img.shields.io/static/v1.svg?label=Status&message=Finished&color=brightgreen)
[![Source](https://img.shields.io/static/v1.svg?label=GitHub&message=Source&color=181717&logo=GitHub)](https://github.com/particle1331/ok-transformer/blob/master/docs/nb/dk/00-containers.ipynb)
[![Stars](https://img.shields.io/github/stars/particle1331/ok-transformer?style=social)](https://github.com/particle1331/ok-transformer)

---

## Introduction

At its core, Docker solves the problem of installing dependencies consistently across different machines. This, along with reproducibility, is a central concern when deploying machine learning systems. To work with Docker, we have to understand the concepts of **images** and **containers**: Docker can be thought of as an entire ecosystem built around creating containers from images and running them. This notebook is adapted from the first three sections of [this course](https://www.udemy.com/course/docker-and-kubernetes-the-complete-guide/).

```{figure} diagrams/00-dockerfix.jpeg
---
name: dockerfix
---
Docker solves the problem of installing software consistently.
```

### OS Kernel

To understand containers, we give a quick overview of **operating systems** (OS). Most operating systems have a **kernel**, a running piece of software that governs access between all programs and the physical hardware connected to your computer. The kernel acts as an intermediate layer between running processes and hardware, and processes talk to it through system calls. Docker isolates resources using *namespaces* and sets usage and prioritization limits using *control groups* (cgroups), both of which are features of the Linux kernel. See this [blog post](https://www.nginx.com/blog/what-are-namespaces-cgroups-how-do-they-work/) for more details.
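
As a rough illustration of the namespace mechanism mentioned above: on Linux, the kernel exposes the namespaces a process belongs to under `/proc/<pid>/ns`. The sketch below lists them for the current shell. It assumes a Linux host (or a shell inside a Linux container) and prints a placeholder message otherwise; the variable names are only for illustration.

```bash
# List the namespaces of the current process (Linux only).
ns_dir=/proc/self/ns
if [ -d "$ns_dir" ]; then
    ns_list=$(ls "$ns_dir")
else
    ns_list="(no $ns_dir here: not a Linux host)"
fi
echo "$ns_list"
```

Processes in the same container point to the same namespace entries, whereas processes in different containers typically differ in at least some of them (e.g. `pid`, `mnt`, `net`).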

```{figure} diagrams/00-osarch.svg
---
name: osarch
---
OS architecture.
```

Note that the host computer does not necessarily run Linux. In that case, Docker sets up a Linux **virtual machine** (VM) inside the host computer. This can be seen below, where the OS of the running Docker server is reported as `linux/arm64`. Here we have [Docker Desktop](https://www.docker.com/products/docker-desktop/) running in the background on a macOS laptop.

```bash
$ docker version
Client:
 Cloud integration: v1.0.29
 Version:           20.10.21
 API version:       1.41
 Go version:        go1.18.7
 Git commit:        baeda1f
 Built:             Tue Oct 25 18:01:18 2022
 OS/Arch:           darwin/arm64
 Context:           desktop-linux
 Experimental:      true

Server: Docker Desktop 4.14.1 (91661)
 Engine:
  Version:          20.10.21
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.7
  Git commit:       3056208
  Built:            Tue Oct 25 17:59:41 2022
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.9
  GitCommit:        1c90a442489720eec95342e1789ee8a5e1b9536f
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
```


### Hello world

The following example demonstrates pulling an image and running a container from it:

```bash
$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
Digest: sha256:6e8b6f026e0b9c419ea0fd02d3905dd0952ad1feea67543f525c73a0a790fefb
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (arm64v8)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.doc
```

The message above describes the entire process by which the `hello-world` container ends up running on our machine. The image was pulled from [Docker Hub](https://hub.docker.com/), which is a registry of Docker images. Note that the container is created locally, since the local machine is also our compute layer. The container then runs its default command, which executes the `/hello` program that prints the message to the terminal. The `hello-world` image produces a minimal container whose sole purpose is to print this message.

```{figure} diagrams/00-dockerhub.svg
---
name: dockerhub
---
Running `hello-world` on our local machine. 
```

Running this image a second time is significantly faster, since Docker keeps a local **cache** of pulled images. This makes sense since multiple containers are usually created from the same image. The architecture of an image and a container is shown in the following diagram.

```{figure} diagrams/00-helloworld.svg
---
name: helloworld
---
Anatomy of a Docker image and the resulting `hello-world` container in the context of the Linux kernel. Note the specific partition on the hard disk for the file system of the image.  
```

Here we see that an **image** is essentially a file system snapshot together with a startup command. It can be thought of as a read-only template that provides the daemon with a set of instructions for creating a container. A **container**, on the other hand, is a running process (inside the Linux VM, in our case) with partitioned hardware resources allocated by the kernel.
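
One way to see the "startup command" half of an image is to read it from the image metadata. The sketch below uses `docker image inspect` with a Go template to print the default command of `hello-world`. It assumes the image is already available locally (e.g. from the `docker run hello-world` above) and prints a placeholder otherwise; the variable name is only for illustration.

```bash
# Assumes hello-world was already pulled; otherwise fall back to a placeholder.
if docker image inspect hello-world >/dev/null 2>&1; then
    # Print the image's default startup command as a JSON array.
    startup_cmd=$(docker image inspect --format '{{json .Config.Cmd}}' hello-world)
else
    startup_cmd="(hello-world image not available locally)"
fi
echo "startup command: $startup_cmd"
```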

## Manipulating containers using the Docker Client

In this section, we take a deeper look at the commands available in the Docker client. First, we run the `ubuntu` image, which is quite a bit more complex than `hello-world`. Its default command is `bash`, which exits immediately here because no interactive terminal is attached, so it looks like nothing happens after the image is downloaded from Docker Hub:

```bash
$ docker run ubuntu
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
Digest: sha256:9a0bdde4188b896a372804be2384015e90e3f84906b750c1a53539b585fbbe7f
Status: Downloaded newer image for ubuntu:latest
```


We can override the default command by appending another command, such as `ls`, after the image name. Note that this works because `ls` is a program that exists in the `ubuntu` image. The familiar program `echo` also exists, so we can test that as well.

```bash
$ docker run ubuntu ls
bin
boot
dev
etc
home
lib
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
```


```bash
$ docker run ubuntu echo 'hello, world!'
hello, world!
```


```{figure} diagrams/00-ubuntu.svg
---
name: ubuntu
---
Overriding the default command of `ubuntu`. Running `ls` instead of `bash`. 
```

To list all running containers we can use the following command. This will be very useful for determining the ID of a running container when we want to issue commands on specific containers.

```bash
$ docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
```


All our previous containers exited immediately after running, so there are no running containers. To see all containers, including stopped ones, we use the `--all` flag. Note that we have multiple containers (distinct IDs) for the same `ubuntu` image, since we called `docker run` on it more than once. Also, the [exit statuses](https://docs.docker.com/engine/reference/run/#exit-status) are all zero, which means no errors were encountered:

```bash
$ docker ps --all
CONTAINER ID   IMAGE         COMMAND                  CREATED         STATUS                     PORTS     NAMES
0edf99e3e81f   ubuntu        "echo 'hello, world!'"   2 minutes ago   Exited (0) 2 minutes ago             bold_euclid
dce17437356d   ubuntu        "/bin/bash"              2 minutes ago   Exited (0) 2 minutes ago             exciting_murdock
4eabe9b34275   hello-world   "/hello"                 2 minutes ago   Exited (0) 2 minutes ago             lucid_merkle
```


### Container lifecycle

Above we saw that running the same image twice resulted in two distinct containers. The `docker run` command is actually equivalent to two separate steps: **create** and **start**. Creating a container sets up its file system, while starting it executes its startup command. It follows that to start a container we have to point to a specific container ID:

```bash
$ docker start -a 0edf99e3e81f
hello, world!
```


The `-a` flag attaches the container's output to the terminal so we can view it. Alternatively, we can use `docker logs <id>` to see all the logs from the container. Listing the containers again, the existing `0edf99e3e81f` container now has a more recent exit time:

```bash
$ docker ps --all
CONTAINER ID   IMAGE         COMMAND                  CREATED          STATUS                      PORTS     NAMES
0edf99e3e81f   ubuntu        "echo 'hello, world!'"   12 minutes ago   Exited (0) 15 seconds ago             bold_euclid
dce17437356d   ubuntu        "/bin/bash"              12 minutes ago   Exited (0) 12 minutes ago             exciting_murdock
4eabe9b34275   hello-world   "/hello"                 13 minutes ago   Exited (0) 13 minutes ago             lucid_merkle
```


Note that we cannot modify the startup command of a container once it has been created. This is important: the startup command can only be overridden at container creation. As mentioned, we can create a new container without starting it. Here the container has the default `/hello` command. The status of a newly created container is `Created` instead of the usual `Exited`:

```bash
$ docker create hello-world
a1f10af23bb8e226d338353929f4e279c8f061f1b09206d95ac135a17f5caa19
```


```bash
$ docker ps --all
CONTAINER ID   IMAGE         COMMAND                  CREATED          STATUS                      PORTS     NAMES
a1f10af23bb8   hello-world   "/hello"                 11 seconds ago   Created                               gracious_mclaren
0edf99e3e81f   ubuntu        "echo 'hello, world!'"   14 minutes ago   Exited (0) 2 minutes ago              bold_euclid
dce17437356d   ubuntu        "/bin/bash"              14 minutes ago   Exited (0) 14 minutes ago             exciting_murdock
4eabe9b34275   hello-world   "/hello"                 14 minutes ago   Exited (0) 14 minutes ago             lucid_merkle
```
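
The create and start steps can also be chained in a small script by capturing the container ID that `docker create` prints. This is a minimal sketch, assuming the `ubuntu` image is available locally (it was pulled earlier); the message text and variable names are made up for illustration, and the container is removed at the end to keep `docker ps --all` tidy.

```bash
# Assumes the ubuntu image is available locally; otherwise fall back to a placeholder.
if docker image inspect ubuntu >/dev/null 2>&1; then
    # Step 1: create the container; the startup command is fixed at this point.
    cid=$(docker create ubuntu echo "hello from create + start")
    # Step 2: start it, attaching so that its output can be captured.
    output=$(docker start -a "$cid")
    # Remove the exited container so it does not accumulate.
    docker rm "$cid" >/dev/null
else
    output="(ubuntu image not available locally)"
fi
echo "$output"
```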


### Stopping containers

Let us create a container that runs for a long time (e.g. forever). Here we use `busybox`, which combines tiny versions of many common UNIX utilities into a single small executable. The `busybox` image takes up only about 4 MB on our machine, compared to 60+ MB for the `ubuntu` image. Also, the `ping` command does not exist in the `ubuntu` image.

```bash
$ docker create busybox ping google.com
84655b09dc1c9bc48176be84154948bf24f2254e9a3c7951d3b898b897ab7eb5
```


```bash
$ docker start 84655b09dc1c9bc48176be84154948bf24f2254e9a3c7951d3b898b897ab7eb5
84655b09dc1c9bc48176be84154948bf24f2254e9a3c7951d3b898b897ab7eb5
```


Notice the `Up` status:

```bash
$ docker ps --all
CONTAINER ID   IMAGE     COMMAND             CREATED         STATUS        PORTS     NAMES
84655b09dc1c   busybox   "ping google.com"   5 seconds ago   Up 1 second             reverent_mahavira
```


To stop containers we use either `docker stop` or `docker kill`. The `stop` command sends SIGTERM to the main process in the container. This gives it 10 seconds (by default) for cleanup, after which a fallback SIGKILL is sent to terminate the process immediately. The `kill` command sends SIGKILL right away. Indeed, the stop command below takes 10.5 seconds, while a kill command takes less than a second.

```bash
$ docker stop 84655b09dc1c
84655b09dc1c
```


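The exit status is nonzero: 137 = 128 + 9 indicates that the process was terminated by SIGKILL (signal 9). This is most likely because `ping` runs as PID 1 in the container without installing a handler for the termination signal sent by `docker stop`, so that signal is ignored and Docker falls back to SIGKILL after the timeout: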

```bash
$ docker ps --all
CONTAINER ID   IMAGE     COMMAND             CREATED          STATUS                                PORTS     NAMES
84655b09dc1c   busybox   "ping google.com"   25 seconds ago   Exited (137) Less than a second ago             reverent_mahavira
```
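
The relation between being killed by a signal and the 137 exit status can be reproduced without Docker. The sketch below starts an ordinary process that ignores SIGTERM (a stand-in for a container's main process that does not handle it), sends SIGTERM, then SIGKILL, and reads the exit status reported by the shell. The one-second pauses are arbitrary choices for illustration.

```bash
# Start a process that ignores SIGTERM; `exec` keeps the same PID for `sleep`.
sh -c 'trap "" TERM; exec sleep 30' &
pid=$!
sleep 1                      # give the child time to set up the trap

kill -TERM "$pid"            # polite request, ignored by the child
sleep 1
if kill -0 "$pid" 2>/dev/null; then
    echo "still running after SIGTERM"
fi

kill -KILL "$pid"            # cannot be caught or ignored
status=0
wait "$pid" || status=$?
echo "exit status: $status"  # 128 + 9 (SIGKILL)
```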


### Interacting with containers

In this section, we will use [Redis](https://redis.io/) from the `redis` image. Redis is an in-memory data store that is useful as a database and cache engine. Like `ubuntu`, this image has multiple programs installed. Two of them are of particular interest to us: `redis-server`, which runs the data store, and `redis-cli`, the command-line client used to talk to it. Here we run `redis` in a separate terminal so that the server keeps running:

```bash
$ docker run redis
1:C 26 Feb 2023 20:58:33.462 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 26 Feb 2023 20:58:33.462 # Redis version=7.0.8, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 26 Feb 2023 20:58:33.462 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
1:M 26 Feb 2023 20:58:33.462 * monotonic clock: POSIX clock_gettime
1:M 26 Feb 2023 20:58:33.463 * Running mode=standalone, port=6379.
1:M 26 Feb 2023 20:58:33.463 # Server initialized
1:M 26 Feb 2023 20:58:33.465 * Ready to accept connections
```

From the logs, we see that the redis server is running. This is also shown below:

```bash
$ docker ps --no-trunc
CONTAINER ID                                                       IMAGE     COMMAND                               CREATED         STATUS         PORTS      NAMES
9582634cf49dab707b956c1fc24c9d219e2260e0dd16ab44e9c252c721023b50   redis     "docker-entrypoint.sh redis-server"   8 minutes ago   Up 8 minutes   6379/tcp   suspicious_wozniak
```


Since the server is isolated inside the container, we need to start the client inside the container to access it. This can be done using `docker exec` with the `-it` flags (`-i` keeps STDIN open and `-t` allocates a pseudo-terminal), since we want to control the client from our terminal.

```bash
$ docker exec -it 9582634cf49d redis-cli
127.0.0.1:6379>
```


```{figure} diagrams/00-interactive.svg
---
width: 80%
name: exec
---
Executing a command from our terminal to a running process inside the container. Each running Linux process in a container has STDIN, STDOUT, and STDERR channels. These channels connect with the terminal during interactive mode.
```

**Remark.** We can also use `docker run` with the `-it` flags, e.g. with `bash` or `sh` as the command, to get a shell in a new container. This overrides the default startup command, so it will not run, which can be useful for exploring the file system of the image.

### Container isolation

Note that containers have **completely isolated** file systems by default. For example, we can wreck the file system of one container and simply create a fresh container from the same image. This also keeps the processes running in a container isolated from the host computer.

```bash
$ docker run -it redis bash
root@ecba8b4819f8:/data#
root@ecba8b4819f8:/# rm -rf * > /dev/null
...
rm: cannot remove 'sys/module/workqueue/parameters/watchdog_thresh': Read-only file system
rm: cannot remove 'sys/module/workqueue/parameters/disable_numa': Read-only file system
rm: cannot remove 'sys/module/workqueue/parameters/debug_force_rr_cpu': Read-only file system
rm: cannot remove 'sys/module/workqueue/parameters/power_efficient': Read-only file system
rm: cannot remove 'sys/module/tpm/uevent': Read-only file system
rm: cannot remove 'sys/module/tpm/parameters/suspend_pcr': Read-only file system
rm: cannot remove 'sys/module/tpm/version': Read-only file system
rm: cannot remove 'sys/module/sr_mod/uevent': Read-only file system
rm: cannot remove 'sys/module/sr_mod/parameters/xa_test': Read-only file system
rm: cannot remove 'sys/module/ip_vs_ftp/uevent': Read-only file system
rm: cannot remove 'sys/module/ip_vs_ftp/parameters/ports': Read-only file system
root@ecba8b4819f8:/# ls
bash: /bin/ls: No such file or directory
```

We now create a new container and run `ls` in it. Note that the container ID is different:

```bash
$ docker run -it redis bash
root@c8ee0b59eb85:/data# cd .. && ls
bin  boot  data  dev  etc  home  lib  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
root@c8ee0b59eb85:/#
```

## Building custom images

Throughout the examples above, we have been using images created by other engineers and uploaded with public access to Docker Hub. We want to know how to create our own images so that we can run our own applications inside custom containers. Our custom images can then be uploaded to an image registry (e.g. Docker Hub), from which our services can pull them to run our applications. Images are built using a **Dockerfile**, which can be thought of as the configuration for the resulting containers.

```{figure} diagrams/00-images.svg
---
width: 80%
name: images
---
Existing images from Docker Hub.
```

### Dockerfile

Dockerfiles start by specifying a **base image**. This makes sense for the level of abstraction we are working at, since building an image from scratch is rarely practical. To demonstrate this, we create a Dockerfile that runs `redis-server`:

```bash
$ tree redis-image
redis-image
└── Dockerfile

0 directories, 1 file
```


```{figure} diagrams/00-dockerfile.svg
---
name: dockerfile
---
How Dockerfile works for creating images and the general process of writing a Dockerfile.
```

We use [`alpine`](https://hub.docker.com/_/alpine) as the base image, which is based on [Alpine Linux](https://www.alpinelinux.org/). For our purposes, we choose this as a minimal image (only 5 MB!) that comes with a sufficiently useful set of preinstalled programs.

```bash
$ cat redis-image/Dockerfile
FROM alpine

RUN apk add --update redis

CMD ["redis-server"]
```


### Docker build process

Docker [build](https://docs.docker.com/engine/reference/commandline/build) simulates a sequence of installation steps starting from the base image. A container is first created from the base image, and the process given by the first `RUN` instruction runs inside it. A snapshot of the resulting file system is taken to get an intermediate image, which is used as the base for the next `RUN` instruction. This is repeated for each `RUN` instruction in order. Finally, a last intermediate container is created with its startup command set to the one given in `CMD`, and a file system snapshot of this container, together with the modified startup command, becomes the final image.

**Remark.** Note that all intermediate containers are shut down and removed unless we set `--rm=false`. Keeping the intermediate containers can be useful for debugging.
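
To make the sequence concrete, the Dockerfile used in this section can be annotated with what each instruction contributes during the build, following the description above. This is the same three-instruction file with comments added; the step numbering is only for illustration.

```dockerfile
# Step 1: start from the file system snapshot and metadata of the base image.
FROM alpine

# Step 2: run this command in a temporary container created from the previous
# image, then snapshot the resulting file system as a new intermediate image.
RUN apk add --update redis

# Step 3: no program is installed here; only the default startup command
# stored with the image is changed.
CMD ["redis-server"]
```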

```bash
$ docker build redis-image --quiet
sha256:391276f9cfebc3f5c1e35051cd061a4c58ea352552fe7ef89d37ff507207c3d5
```


```{figure} diagrams/00-imagebuild.svg
---
width: 80%
name: imagebuild
---
Building based on `alpine:latest` sequentially installing `redis` and changing the startup command to `redis-server`.
```

To see the built image:

```bash
$ docker image ls
REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
<none>       <none>    391276f9cfeb   3 minutes ago   13.4MB
redis        latest    e79ba23ed43b   2 weeks ago     111MB
busybox      latest    abaa813f94fd   7 weeks ago     3.73MB
```


The image can be tagged to give it a more descriptive name. Note that the result is the same image (same ID):

```bash
$ docker build -t okt/redis:latest redis-image --quiet
sha256:391276f9cfebc3f5c1e35051cd061a4c58ea352552fe7ef89d37ff507207c3d5
```


```bash
$ docker image ls
REPOSITORY   TAG       IMAGE ID       CREATED          SIZE
okt/redis    latest    391276f9cfeb   36 minutes ago   13.4MB
redis        latest    e79ba23ed43b   2 weeks ago      111MB
busybox      latest    abaa813f94fd   7 weeks ago      3.73MB
```


We now start the server with `docker run okt/redis` in a separate terminal. Since no tag is given, Docker uses the `latest` tag by default. We then access the Redis client in this container using the `exec` command:

```bash
$ docker ps
CONTAINER ID   IMAGE       COMMAND          CREATED              STATUS              PORTS     NAMES
d613792f4c39   okt/redis   "redis-server"   About a minute ago   Up About a minute             strange_elgamal
```


```bash
$ docker exec d613792f4c39 redis-cli set hello world
OK
```


```bash
$ docker exec d613792f4c39 redis-cli get hello
world
```
