# Storage and volumes

Consider what it might look like to run a database program inside a container.
You would package the software with the image, and when you start the container,
it might initialize an empty database. 
- When programs connect to the database and enter data, where is that data stored? 
- Is it in a file inside the container? 
- What happens to that data when you stop the container or remove it? 
- How would you move your data if you wanted to upgrade the database program? 
- What happens to that storage on a cloud machine when it is terminated?

Consider another situation where you’re running a couple of different web applications
inside different containers. 
- Where would you write log files so that they will outlive the container? 
- How would you get access to those logs to troubleshoot a problem?
- How can other programs such as log digest tools get access to those files?

The union filesystem is not appropriate for working with long-lived data or sharing
data between containers, or a container and the host. The answer to all these questions
involves managing the container filesystem and mount points.

## File trees and mount points

Unlike some other operating systems, Linux unifies all storage into a single tree. Storage
devices such as disk partitions or USB disk partitions are attached to specific locations
in that tree. Those locations are called `mount points`. A mount point defines the location
in the tree, the access properties to the data at that point (for example, writability),
and the source of the data mounted at that point (for example, a specific hard
disk, USB device, or memory-backed virtual disk). Figure 4.1 depicts a filesystem constructed
from multiple storage devices, with each device mounted to a specific location
and level of access.

Mount points allow software and users to use the file tree in a Linux environment
without knowing exactly how that tree is mapped into specific storage devices. This is
particularly useful in container environments.

Every container has something called a MNT namespace and a unique file tree root.
This is discussed in detail in chapter 6. For now, it is enough to understand that the
image that a container is created from is mounted at that container’s file tree root, or
at the / point, and that every container has a different set of mount points.

It follows that if different storage devices can be mounted at various points in a
file tree, we can mount nonimage-related storage at other points in a container file
tree. That is exactly how containers get access to storage on the host filesystem and
share storage between containers.

The rest of this chapter elaborates on how to manage storage and the mount
points in containers. The best place to start is by understanding the three most common
types of storage mounted into containers:
- Bind mounts
- In-memory storage
- Docker volumes

These storage types can be used in many ways. Figure 4.2 shows an example of a container
filesystem that starts with the files from the image, adds an in-memory tmpfs at
/tmp, bind-mounts a configuration file from the host, and writes logs into a Docker
volume on the host.

<img src="https://dpzbhybb2pdcj.cloudfront.net/nickoloff2/Figures/04fig02_alt.jpg" alt="Figure 4.2">

All three types of mount points can be created using the `--mount` flag on the `docker run` and `docker create` subcommands.
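
As a rough sketch of how the three types combine, the helper below mirrors the layout described for figure 4.2 in a single `docker run` call. The function name, the container name `mixed-mounts-demo`, the volume name `demo-logs`, and the in-container destinations are assumptions chosen for illustration; only the `--mount` types and options come from this chapter.

```shell
# Sketch: one container that uses all three mount types.
# All names here (function, container, volume, destinations) are hypothetical.
start_mixed_mounts() {
    # $1: absolute host path of an NGINX configuration file
    docker run -d --name mixed-mounts-demo \
        --mount type=tmpfs,dst=/tmp \
        --mount type=bind,src="$1",dst=/etc/nginx/conf.d/default.conf,readonly=true \
        --mount type=volume,src=demo-logs,dst=/var/log/nginx \
        nginx:latest
}

# Example call (requires a running Docker engine):
#   start_mixed_mounts /absolute/path/to/site.conf
```

Each `--mount` flag is independent, so the same pattern works with any subset of the three types.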

## Bind mounts

Bind mounts are mount points used to remount parts of a filesystem tree onto other
locations. When working with containers, bind mounts attach a user-specified location
on the host filesystem to a specific point in a container file tree. Bind mounts are useful
when the host provides a file or directory that is needed by a program running in a
container, or when that containerized program produces a file or log that is processed
by users or programs running outside containers.

Consider the example in figure 4.3. Suppose you’re running a web server that
depends on sensitive configuration on the host and emits access logs that need to be
forwarded by your log-shipping system. You could use Docker to launch the web
server in a container and bind-mount the configuration location as well as the location
where you want the web server to write logs.

<img src="https://dpzbhybb2pdcj.cloudfront.net/nickoloff2/Figures/04fig03_alt.jpg" alt="Figure 4.3">

You can try this for yourself. Create a placeholder log file and create a special NGINX
configuration file named example.conf. Run the following commands to create and
populate the files:

`gradiva/vaja01`

    touch ./example.log
    cat >./example.conf <<EOF
    server {
        listen 80;
        server_name localhost;
        access_log /var/log/nginx/custom.host.access.log main;
        location / {
            root /usr/share/nginx/html;
            index index.html index.htm;
        }
    }
    EOF

Once a server is started with this configuration file, it will offer the NGINX default site
at `http://localhost/`, and access logs for that site will be written to a file in the container
at `/var/log/nginx/custom.host.access.log`. The following command will start
an NGINX HTTP server in a container where your new configuration is bind-mounted
to the server’s configuration root:

> Originally, the `-v` or `--volume` flag was used for standalone containers and the `--mount` flag was used for swarm services. Starting with Docker 17.06, you can also use `--mount` with standalone containers. The biggest difference is that the `-v` syntax combines all the options in one field, while the `--mount` syntax separates them into explicit key-value pairs. That makes `--mount` more verbose but also more explicit, so new users should try the `--mount` syntax first.

    MAIN_PATH=/home/leon11/docker-k8s/docker/Storage_and_volumes/gradiva/vaja01; \
    CONF_SRC=${MAIN_PATH}/example.conf; \
    CONF_DST=/etc/nginx/conf.d/default.conf; \
    LOG_SRC=${MAIN_PATH}/example.log; \
    LOG_DST=/var/log/nginx/custom.host.access.log; \
    docker run -d --name diaweb \
        --mount type=bind,src=${CONF_SRC},dst=${CONF_DST} \
        --mount type=bind,src=${LOG_SRC},dst=${LOG_DST} \
        -p 80:80 \
        nginx:latest

With this container running, you should be able to point your web browser at
`http://localhost/` and see the NGINX hello-world page. You will not see any
access logs in the container log stream (`docker logs diaweb`). However, you will be
able to see those logs if you examine the example.log file in your working directory:
`cat ./example.log`.

In this example you used the `--mount` option with the `type=bind` option. The
other two mount parameters, `src` and `dst`, define the source location on the host file
tree and the destination location on the container file tree. You must specify locations
with absolute paths, but in this example, we used shell expansion and shell variables to
make the command easier to read.
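
If you would rather not hard-code the directory, one option is to derive the absolute paths from the current working directory. The following is a minimal sketch; `abs_path` is a hypothetical helper name, and it only prints the resulting `--mount` arguments instead of starting a container.

```shell
# Sketch: turn a possibly relative path into an absolute one for --mount.
# abs_path is a hypothetical helper; the parent directory must exist.
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    # ${dir%/} avoids a doubled slash when the directory is the root.
    printf '%s/%s\n' "${dir%/}" "$(basename "$1")"
}

CONF_SRC=$(abs_path ./example.conf)
LOG_SRC=$(abs_path ./example.log)

# Print the mount arguments that would be passed to docker run.
echo "--mount type=bind,src=${CONF_SRC},dst=/etc/nginx/conf.d/default.conf"
echo "--mount type=bind,src=${LOG_SRC},dst=/var/log/nginx/custom.host.access.log"
```

Run from the directory that holds example.conf and example.log, this yields the same sources as the hard-coded `MAIN_PATH` variant.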

This example touches on an important feature of volumes. When you mount a volume
on a container filesystem, it replaces the content that the image provides at that
location. By default, the `nginx:latest` image provides some default configuration at
`/etc/nginx/conf.d/default.conf`, but when you created the bind mount with a destination
at that path, the content provided by the image was overridden by the content
on the host. This behavior is the basis for the polymorphic container pattern discussed
later in the chapter.

Expanding on this use case, suppose you want to make sure that the NGINX web
server can’t change the contents of the configuration volume. Even the most trusted
software can contain vulnerabilities, and it’s best to minimize the impact of an attack on
your website. Fortunately, Linux provides a mechanism to make mount points read-only.
You can do this by adding the `readonly=true` argument to the mount specification. In
the example, you should change the run command to something like the following:

    docker rm -f diaweb

    MAIN_PATH=/home/leon11/docker-k8s/docker/Storage_and_volumes/gradiva/vaja01; \
    CONF_SRC=${MAIN_PATH}/example.conf; \
    CONF_DST=/etc/nginx/conf.d/default.conf; \
    LOG_SRC=${MAIN_PATH}/example.log; \
    LOG_DST=/var/log/nginx/custom.host.access.log; \
    docker run -d --name diaweb \
        --mount type=bind,src=${CONF_SRC},dst=${CONF_DST},readonly=true \
        --mount type=bind,src=${LOG_SRC},dst=${LOG_DST} \
        -p 80:80 \
        nginx:latest

By creating the read-only mount, you can prevent any process inside the container from
modifying the content of the volume. You can see this in action by running a quick test:

    docker exec diaweb \
    sed -i "s/listen 80/listen 8080/" /etc/nginx/conf.d/default.conf

This command executes a sed command inside the diaweb container and attempts
to modify the configuration file. The command fails because the file is mounted as
read-only.
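
A plain append through a shell redirection can be a less ambiguous probe than `sed -i`. The reason is that `sed -i` typically replaces a file by renaming a temporary copy over it, and that rename may be refused on a bind-mounted file regardless of the read-only flag. The sketch below assumes the `diaweb` container from this section; `probe_write` is a hypothetical helper name.

```shell
# Sketch: probe whether a path inside a running container accepts writes.
# probe_write is a hypothetical helper. If the write succeeds, it appends a
# comment line to the file, so only use it where that is harmless.
probe_write() {
    # $1: container name, $2: path inside the container
    if docker exec "$1" sh -c 'echo "# write probe" >> "$1"' sh "$2" 2>/dev/null; then
        echo "writable"
    else
        echo "not writable"
    fi
}

# Example call (requires the running diaweb container):
#   probe_write diaweb /etc/nginx/conf.d/default.conf
```

With the read-only mount in place, the probe should report that the file is not writable. Against the earlier container without `readonly=true`, the append would be expected to succeed and show up in example.conf on the host.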

The first problem with bind mounts is that they tie otherwise portable container
descriptions to the filesystem of a specific host. If a container description depends on
content at a specific location on the host filesystem, that description isn’t portable to
hosts where the content is unavailable or available in some other location.

The next big problem is that they create an opportunity for conflict with other
containers. It would be a bad idea to start multiple instances of Cassandra that all use
the same host location as a bind mount for data storage. In that case, each of the
instances would compete for the same set of files. Without other tools such as file
locks, that would likely result in corruption of the database.

Bind mounts are appropriate tools for workstations, machines with specialized
concerns, or in systems combined with more traditional configuration management
tooling. It’s better to avoid these kinds of specific bindings in generalized platforms or
hardware pools.

## In-memory storage

Most service software and web applications use private key files, database passwords,
API key files, or other sensitive configuration files, and need upload buffering space.
In these cases, it is important that you never include those types of files in an image or
write them to disk. Instead, you should use in-memory storage. You can add in-memory
storage to containers with a special type of mount.

Set the type option on the mount flag to tmpfs. This is the easiest way to mount a
memory-based filesystem into a container’s file tree. Consider this command:

    docker run --rm \
        --mount type=tmpfs,dst=/tmp \
        --entrypoint mount \
        alpine:latest -v

This command creates an empty tmpfs device and attaches it to the new container’s
file tree at /tmp. Any files created under this file tree will be written to memory
instead of disk. More than that, the mount point is created with sensible defaults for
generic workloads. Running the command will display a list of all the mount points
for the container. The list will include the following line:

    tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noexec,relatime)

This line describes the mount-point configuration. From left to right, it indicates the
following:
- A tmpfs device is mounted to the tree at /tmp.
- The device has a tmpfs filesystem.
- The tree is read/write capable.
- suid bits will be ignored on all files in this tree.
- No files in this tree will be interpreted as special devices.
- No files in this tree will be executable.
- File access times will be updated if they are older than the current modify or change time.

Additionally, the tmpfs device will not have any size limits by default and will be world-writable
(has file permissions 1777 in octal). You can add a size limit and change the
file mode with two additional options, `tmpfs-size` and `tmpfs-mode`:

    docker run --rm \
        --mount type=tmpfs,dst=/tmp,tmpfs-size=16k,tmpfs-mode=1770 \
        --entrypoint mount \
        alpine:latest -v

This command limits the tmpfs device mounted at /tmp to 16 KB and configures it so that it is not
readable by other in-container users.
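
To see what mode 1770 means without starting a container, you can apply it to a scratch directory on the host. In octal, the leading 1 is the sticky bit, the two 7s grant read, write, and execute to the owner and the group, and the final 0 grants nothing to other users. This is only an illustration on an ordinary directory, not on a tmpfs mount.

```shell
# Sketch: show what mode 1770 looks like on an ordinary directory.
# This runs on the host and does not involve Docker or tmpfs.
scratch=$(mktemp -d)
chmod 1770 "$scratch"

# Keep only the 10-character type-and-permission field from ls.
perms=$(ls -ld "$scratch" | cut -c1-10)
echo "$perms"    # drwxrwx--T

rmdir "$scratch"
```

The capital `T` in the last position indicates that the sticky bit is set while other users have no execute permission.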

## Docker volumes

Docker volumes are named filesystem trees managed by Docker. They can be implemented
with disk storage on the host filesystem, or another more exotic backend such
as cloud storage. All operations on Docker volumes can be accomplished using the
`docker volume` subcommand set. Using volumes is a method of decoupling storage
from specialized locations on the filesystem that you might specify with bind mounts.

If you were to convert the web server and log-forwarding container example from
section 4.2 to use a volume for sharing access to the logs, the pair could run on any
machine without considering other software that might have a conflict with static locations
on disk. That example would look like figure 4.4, and the containers would read
and write logs through the location-example volume.

You can create and inspect volumes by using the `docker volume create` and
`docker volume inspect` subcommands. By default, Docker creates volumes by using
the local volume plugin. The default behavior will create a directory to store the contents
of a volume somewhere in a part of the host filesystem under control of the
Docker engine. For example, the following two commands will create a volume
named location-example and display the location of the volume host filesystem tree:

    docker volume create \
        --driver local \
        --label example=location \
        location-example

    docker volume inspect \
        --format "{{json .Mountpoint}}" \
        location-example

Docker volumes may seem difficult to work with if you’re manually building or linking
tools together on your desktop, but for larger systems in which specific locality
of the data is less important, volumes are a much more effective way to organize
your data. Using them decouples storage from other potential concerns of the system.
By using Docker volumes, you’re simply stating, “I need a place to put some
data that I’m working with.” This is a requirement that Docker can fill on any
machine with Docker installed.

Further, when you’re finished with a volume and you ask Docker to clean things up
for you, Docker can confidently remove any directories or files that are no longer
being used by a container. Using volumes in this way helps manage clutter. As Docker
middleware or plugins evolve, volume users will be able to adopt more advanced
features.
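
One way to ask Docker to clean up is to remove volumes by label, such as the `example=location` label used earlier. The sketch below is a hypothetical helper built from the `docker volume ls` and `docker volume rm` subcommands. Docker should refuse to remove a volume that a container still uses, so the loop only deletes volumes that are no longer attached.

```shell
# Sketch: remove every volume that carries a given label.
# remove_labeled_volumes is a hypothetical helper name.
remove_labeled_volumes() {
    # $1: label filter, for example example=location
    docker volume ls --quiet --filter "label=$1" |
    while read -r vol; do
        docker volume rm "$vol"
    done
}

# Example call (requires a running Docker engine):
#   remove_labeled_volumes example=location
```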

Sharing access to data is a key feature of volumes. If you have decoupled volumes
from known locations on the filesystem, you need to know how to share volumes
between containers without exposing the exact location of that managed storage. The
next section describes two ways to share data between containers by using volumes.

### Volumes provide container-independent data management

Semantically, a volume is a tool for segmenting and sharing data that has a scope or life
cycle that’s independent of a single container. That makes volumes an important part
of any containerized system design that shares or writes files. Examples of data that
differs in scope or access from a container include the following:
- Database software versus database data
- Web application versus log data
- Data processing application versus input and output data
- Web server versus static content
- Products versus support tools

Volumes enable separation of concerns and create modularity for architectural components.
That modularity helps you understand, build, support, and reuse parts of
larger systems more easily.

Think about it this way: images are appropriate for packaging and distributing relatively
static files such as programs; volumes hold dynamic data or specializations. This
distinction makes images reusable and data simple to share. This separation of relatively
static and dynamic file space allows application or image authors to implement
advanced patterns such as polymorphic and composable tools.

A polymorphic tool is one that maintains a consistent interface but might have several
implementations that do different things. Consider an application such as a general
application server. Apache Tomcat, for example, is an application that provides
an HTTP interface on a network and dispatches any requests it receives to pluggable
programs. Tomcat has polymorphic behavior. Using volumes, you can inject behavior
into containers without modifying an image. Alternatively, consider a database program
like MongoDB or MySQL. The value of a database is defined by the data it contains.
A database program always presents the same interface but takes on a wholly
different value depending on the data that can be injected with a volume. The polymorphic
container pattern is the subject of section 4.5.1.