## Overlay Filesystems Demo

This notebook explores overlay filesystems and Docker's use of this feature. In the notebook, you will build and run some Docker containers and explore how the layered filesystem specified by the Dockerfile is represented in the host filesystem.

### Preparation

This lab walks through the Docker overlay filesystem. To make the folders created here easy to observe, you can run the following command to reset the Docker installation and remove all cached images.

**Do not execute this on a production system or one containing data you cannot remove.**


In [11]:
sudo docker system prune --all --force --volumes

Deleted Containers:
a7464de7a665a2eafadc572b0a27b2663e969d8d983b26ac0e466858d032f4be

Deleted Volumes:
844f968ab61fc2cc12569b53ca229503dff0b261c6236da46ad5a4a72dc2e398

Deleted Images:
untagged: amazonlinux:2
untagged: amazonlinux@sha256:d4a4328d679534af47c7a765d62a9195eb27f9a95c03213fca0a18f95aa112cd
deleted: sha256:01da4f8f9748b3ac6cf5d265152fb80b9d7545075be8aa0a3d60770a98db9768
deleted: sha256:1c729a602f80e0984b76377e1168f5c0e42d0b92acbbbacfc19d983b06cd7565
untagged: redis:latest
untagged: redis@sha256:000339fb57e0ddf2d48d72f3341e47a8ca3b1beae9bdcb25a96323095b72a79b
deleted: sha256:a55fbf438dfd878424c402e365ef3d80c634f07d0f5832193880ee1b95626e4e
deleted: sha256:bd436209688cc9495c35573533b0a02bcb66abef9f686930c6a8532b9083182b
deleted: sha256:0d61b290c44d5b4a7f096a665b3a62fc233d927f1b402e95571eb9cd4cc4fe09
deleted: sha256:f42fd41b71c4634d1b960afa6678e751c7a45f6b50b6043da9092f3f90b9e37b
deleted: sha256:ce68cd4cf8096b42a1ffe1349f19f364882da50157b4ba35bb499a8183d362b1
deleted: sha256:0c4

The following two commands restart Docker, which clears out any running containers to further reset the demo environment:

In [9]:
sudo systemctl stop docker
sudo systemctl start docker

### Exploring Docker's use of filesystems

Docker stores container filesystems under /var/lib/docker/overlay2. Before you run any containers, the folder contains two objects:

In [12]:
sudo ls /var/lib/docker/overlay2 -l

total 0
brw------- 1 root root 202, 1 May  1 05:01 backingFsBlockDev
drwx------ 2 root root      6 May  1 05:01 l


Let's download a simple, single-layer image and check how it is represented in the filesystem:

In [13]:
docker pull amazonlinux:2

2: Pulling from library/amazonlinux

Digest: sha256:d4a4328d679534af47c7a765d62a9195eb27f9a95c03213fca0a18f95aa112cd
Status: Downloaded newer image for amazonlinux:2


In [14]:
sudo ls /var/lib/docker/overlay2/ -l

total 0
drwx------ 3 root root     30 May  1 05:07 37454eb4b45b90506f54a5b2a40ef1b1ed334f162f14e931e8f643f6ab4d714f
brw------- 1 root root 202, 1 May  1 05:01 backingFsBlockDev
drwx------ 2 root root     40 May  1 05:07 l


Docker provides a metadata description of the image, which is accessible via the docker inspect command. In the JSON document returned, the overlay filesystem is described under the GraphDriver section. We can extract the path and verify that it matches the folder shown above as follows:

In [15]:
docker inspect --format='{{.GraphDriver.Data.MergedDir}}' "amazonlinux:2"

/var/lib/docker/overlay2/37454eb4b45b90506f54a5b2a40ef1b1ed334f162f14e931e8f643f6ab4d714f/merged


In [16]:
FS_PATH=$(docker inspect --format='{{.GraphDriver.Data.MergedDir}}' "amazonlinux:2" | rev | cut -d/ -f2- | rev)
echo $FS_PATH

/var/lib/docker/overlay2/37454eb4b45b90506f54a5b2a40ef1b1ed334f162f14e931e8f643f6ab4d714f
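
The rev | cut | rev pipeline above strips the last path component. As a minimal sketch of an equivalent, the standard dirname utility does the same thing in one step. The path below is copied from the docker inspect output above so that the snippet runs without Docker; in the notebook you would pass the docker inspect command substitution instead.

```shell
# Sample value copied from the docker inspect output above (no Docker needed to run this).
MERGED_DIR="/var/lib/docker/overlay2/37454eb4b45b90506f54a5b2a40ef1b1ed334f162f14e931e8f643f6ab4d714f/merged"

# dirname drops the final path component ("merged"), leaving the layer directory.
FS_PATH=$(dirname "$MERGED_DIR")
echo "$FS_PATH"
```

Quoting the variable also protects against word splitting, which matters if a path ever contains spaces.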


Let's explore the contents of the container folder:

In [17]:
sudo ls $FS_PATH -l

total 4
drwxr-xr-x 18 root root 237 May  1 05:07 diff
-rw-r--r--  1 root root  26 May  1 05:07 link


The diff folder contains the data stored in this layer of the filesystem. We can explore this like any other folder:

In [18]:
sudo ls $FS_PATH/diff

bin   dev  home  lib64	media  opt   root  sbin  sys  usr
boot  etc  lib	 local	mnt    proc  run   srv	 tmp  var


This looks like a normal Linux root filesystem!

The other file present is a text file, link. We can read the contents of this file: 

In [19]:
sudo cat $FS_PATH/link

R7J4UYFYCXOBF26MGEZUUO2WRS

The link file maps back to a symlink stored within the 'l' folder in the root of the /var/lib/docker/overlay2/ folder, which in turn points back to the diff folder containing our container filesystem. 

This behavior is a Docker-specific implementation detail, and not something particular to the use of union/overlay filesystems:

In [21]:
sudo ls /var/lib/docker/overlay2/l -l

total 0
lrwxrwxrwx 1 root root 72 May  1 05:07 R7J4UYFYCXOBF26MGEZUUO2WRS -> ../37454eb4b45b90506f54a5b2a40ef1b1ed334f162f14e931e8f643f6ab4d714f/diff
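
The round trip from link file to symlink to diff folder can be reproduced without Docker. The sketch below builds a stand-in for the overlay2 folder in a temporary directory; the names examplelayer and EXAMPLESHORTID are hypothetical placeholders for the long layer ID and the 26-character link name, not values Docker would generate.

```shell
# Hypothetical stand-in for /var/lib/docker/overlay2, built in a temp directory.
ROOT=$(mktemp -d)
LAYER="examplelayer"       # placeholder for the long layer ID
SHORT_ID="EXAMPLESHORTID"  # placeholder for the short link name

# Recreate the layout seen above: <layer>/diff, <layer>/link, and l/<short id>.
mkdir -p "$ROOT/$LAYER/diff" "$ROOT/l"
printf '%s' "$SHORT_ID" > "$ROOT/$LAYER/link"
ln -s "../$LAYER/diff" "$ROOT/l/$SHORT_ID"

# Follow the chain: link file -> symlink under l/ -> diff directory.
NAME=$(cat "$ROOT/$LAYER/link")
TARGET=$(readlink "$ROOT/l/$NAME")
echo "$NAME -> $TARGET"

# -ef is true when both paths refer to the same directory on disk.
if [ "$ROOT/l/$NAME" -ef "$ROOT/$LAYER/diff" ]; then
    ROUND_TRIP=ok
else
    ROUND_TRIP=failed
fi
echo "round trip: $ROUND_TRIP"

rm -rf "$ROOT"
```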


The other entry in the root of the overlay2 folder is another Docker-specific implementation detail: backingFsBlockDev, a block device node that maps to the root block device of the host OS. Its device numbers (202, 1) match those of xvda1, the partition mounted at /, in the lsblk output:

In [22]:
sudo ls /var/lib/docker/overlay2/ -l
sudo lsblk

total 0
drwx------ 3 root root     30 May  1 05:07 37454eb4b45b90506f54a5b2a40ef1b1ed334f162f14e931e8f643f6ab4d714f
brw------- 1 root root 202, 1 May  1 05:01 backingFsBlockDev
drwx------ 2 root root     40 May  1 05:07 l
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda    202:0    0   8G  0 disk 
└─xvda1 202:1    0   8G  0 part /


### Multiple Layers

In the layer-example folder, I have prepared a simple Dockerfile which presents a 3-layer filesystem:

- Layer 0: Base image (Amazon Linux 2)
- Layer 1: Adds a file: /hello
- Layer 2: Removes the file: /hello

Let's look at the Dockerfile:

In [24]:
cd layer-example
cat Dockerfile

bash: cd: layer-example: No such file or directory
FROM amazonlinux:2
RUN touch /hello
RUN rm /hello


Let's build the image:

In [25]:
docker build -t layer-example .

Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM amazonlinux:2
 ---> 01da4f8f9748
Step 2/3 : RUN touch /hello
 ---> Running in 9e0e7e9e88ae
Removing intermediate container 9e0e7e9e88ae
 ---> b75748f8d30c
Step 3/3 : RUN rm /hello
 ---> Running in 55d88dfc1624
Removing intermediate container 55d88dfc1624
 ---> 40514762916c
Successfully built 40514762916c
Successfully tagged layer-example:latest


From the build log, you can see that Docker passed through three steps (one for each line in the Dockerfile) and created layers for each. 

Now let's list the Docker overlay2 folder again to see what has changed:

In [26]:
sudo ls /var/lib/docker/overlay2/ -latr

total 0
brw-------  1 root root 202, 1 May  1 05:01 backingFsBlockDev
drwx--x--x 15 root root    200 May  1 05:01 ..
drwx------  3 root root     30 May  1 05:07 37454eb4b45b90506f54a5b2a40ef1b1ed334f162f14e931e8f643f6ab4d714f
drwx------  4 root root     55 May  1 05:25 d7e52b9e930112ff10a57ed8de02af7531449717a307158090ca671e34de86c6
drwx------  4 root root     55 May  1 05:25 905b55b8af005dfcd17e3819f97261b828be42679e7beaa8e55f608018d37824
drwx------  2 root root    108 May  1 05:25 l
drwx------  6 root root    256 May  1 05:25 .


Two new directories have been created, but it's not clear which directory corresponds to which layer, or where these IDs have come from.

We can find the IDs by going back to the 'docker inspect' command and pulling these from the GraphDriver section. 

The Docker metadata includes a LowerDir value:

In [27]:
LOWER_DIRS=$(docker inspect --format='{{.GraphDriver.Data.LowerDir}}' "layer-example")
echo $LOWER_DIRS

/var/lib/docker/overlay2/d7e52b9e930112ff10a57ed8de02af7531449717a307158090ca671e34de86c6/diff:/var/lib/docker/overlay2/37454eb4b45b90506f54a5b2a40ef1b1ed334f162f14e931e8f643f6ab4d714f/diff


This value shows the hierarchy of the lower folders, which are layered from right to left (the bottom layer in the filesystem is the last element in the list).
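
To make the ordering concrete, here is a small sketch that picks out both ends of the list using POSIX parameter expansion. The LOWER_DIRS value is copied from the output above so that the snippet runs without Docker; the variable names FIRST_LOWER, BASE_LAYER and LAYER_COUNT are introduced here for illustration only.

```shell
# Sample value copied from the docker inspect output above.
LOWER_DIRS="/var/lib/docker/overlay2/d7e52b9e930112ff10a57ed8de02af7531449717a307158090ca671e34de86c6/diff:/var/lib/docker/overlay2/37454eb4b45b90506f54a5b2a40ef1b1ed334f162f14e931e8f643f6ab4d714f/diff"

# Everything before the first ':' is the lower layer closest to the top.
FIRST_LOWER=${LOWER_DIRS%%:*}
# Everything after the last ':' is the base layer at the bottom of the stack.
BASE_LAYER=${LOWER_DIRS##*:}

echo "closest to top: $FIRST_LOWER"
echo "bottom (base):  $BASE_LAYER"

# Count the lower layers by counting the ':'-separated entries.
LAYER_COUNT=$(echo "$LOWER_DIRS" | tr ':' '\n' | wc -l)
echo "lower layers: $LAYER_COUNT"
```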

We can extract the middle layer (first element in the list) with some shell cut commands:

In [28]:
MIDDLE_DIR=$(echo $LOWER_DIRS | cut -d':' -f1 | rev | cut -d/ -f2- | rev)
echo $MIDDLE_DIR

/var/lib/docker/overlay2/d7e52b9e930112ff10a57ed8de02af7531449717a307158090ca671e34de86c6


In [29]:
sudo ls $MIDDLE_DIR -l

total 8
drwxr-xr-x 2 root root 19 May  1 05:25 diff
-rw-r--r-- 1 root root 26 May  1 05:25 link
-rw-r--r-- 1 root root 28 May  1 05:25 lower
drwx------ 2 root root  6 May  1 05:25 work


Let's explore the diff folder in the middle layer, which contains the changes made in this layer:

In [30]:
sudo ls $MIDDLE_DIR/diff -l

total 0
-rw-r--r-- 1 root root 0 May  1 05:25 hello


So this diff shows the creation of the hello file. This corresponds to line 2 in our Dockerfile. 

To validate this, let's check the 'lower' file, which contains the ID of the layer below this one.

In [31]:
sudo cat $MIDDLE_DIR/lower

l/R7J4UYFYCXOBF26MGEZUUO2WRS

That's interesting, because this is the same value we saw earlier in the single-layer example. It is the identifier of our base amazonlinux:2 layer, and it shows how Docker uses the filesystem to navigate through layers efficiently, as well as to share layers between images built on the same base image.

Next, let's explore the other newly created layer, which, logically, should be the top. We can extract this from the Docker metadata via the UpperDir value:

In [32]:
UPPER_DIR=$(docker inspect --format='{{.GraphDriver.Data.UpperDir}}' "layer-example" | rev | cut -d/ -f2- | rev)
echo $UPPER_DIR

/var/lib/docker/overlay2/905b55b8af005dfcd17e3819f97261b828be42679e7beaa8e55f608018d37824


In [33]:
sudo ls $UPPER_DIR

diff  link  lower  work


This looks very similar to the previous layer, but if we look at the lower file, we'll see the hierarchy:

In [34]:
sudo cat $UPPER_DIR/lower

l/BXEUNNTRSREEVJO2MKZ5LI3AA3:l/R7J4UYFYCXOBF26MGEZUUO2WRS
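
Each entry in the lower file is a path relative to the overlay2 folder, so the whole chain can be resolved by following the symlinks under l. The sketch below does this against a temporary directory rather than the real Docker folder; the layer names base, middle and top and the IDs BASEID and MIDDLEID are hypothetical placeholders.

```shell
# Hypothetical three-layer layout in a temp directory (stand-in for overlay2).
ROOT=$(mktemp -d)
mkdir -p "$ROOT/l" "$ROOT/base/diff" "$ROOT/middle/diff" "$ROOT/top/diff"
ln -s ../base/diff "$ROOT/l/BASEID"
ln -s ../middle/diff "$ROOT/l/MIDDLEID"

# The top layer's lower file lists the layers beneath it, nearest first.
printf '%s' 'l/MIDDLEID:l/BASEID' > "$ROOT/top/lower"

# Split the file on ':' and resolve each entry through its symlink.
RESOLVED=""
for entry in $(tr ':' ' ' < "$ROOT/top/lower"); do
    target=$(readlink "$ROOT/$entry")
    echo "$entry -> $target"
    RESOLVED="${RESOLVED:+$RESOLVED }$target"
done

rm -rf "$ROOT"
```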

So this is the top layer. In this layer, the hello file was removed. How does that work? 

In [35]:
sudo ls $UPPER_DIR/diff -l

total 0
c--------- 1 root root 0, 0 May  1 05:25 hello


To wrap up: the hello file was removed, and this is recorded in the top layer by an entry named hello whose file type is shown as 'c' (a character device) with device number 0, 0. This entry is a whiteout. It tells the overlay filesystem driver to present the unified filesystem without this file. As the walkthrough above shows, though, the original file still exists on disk in the middle layer; it is simply hidden by the tombstone that sits on top of it.
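
As a rough sketch, a whiteout can be recognised in a script by checking for exactly those two properties. The is_whiteout helper below is hypothetical, not part of Docker or the kernel tooling, and it assumes GNU stat, where %t and %T print the device major and minor numbers in hex.

```shell
# Succeeds if the path is an overlay whiteout:
# a character device whose device number is 0,0.
# Assumes GNU stat (%t = major, %T = minor, both in hex).
is_whiteout() {
    [ -c "$1" ] && [ "$(stat -c '%t:%T' "$1")" = "0:0" ]
}

# /dev/null is a character device, but its device number is not 0,0,
# so it should not be reported as a whiteout.
if is_whiteout /dev/null; then
    DEV_NULL_RESULT="whiteout"
else
    DEV_NULL_RESULT="not a whiteout"
fi
echo "/dev/null: $DEV_NULL_RESULT"
```

In the notebook environment, running the helper against $UPPER_DIR/diff/hello (with sudo, since the folder is root-only) should report a whiteout, given the 0, 0 device number shown in the listing above.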