# Resource controls

The features covered in this chapter focus on managing or limiting the risks of
running software. These features prevent software that is misbehaving because of a
bug or attack from consuming resources that might leave your computer unresponsive.
Containers can help ensure that software only uses the computing resources
and accesses the data you expect. You will learn how to give containers resource
allowances, access shared memory, run programs as specific users, control the type
of changes that a container can make to your computer, and integrate with other
Linux isolation tools. Some of these topics involve Linux features that are beyond the
scope of this book. In those cases, we try to give you an idea about their purpose and
basic usage examples, and how you can integrate them with Docker. Figure 6.1 shows
the eight namespaces and features that are used to build Docker containers.

<img src="https://dpzbhybb2pdcj.cloudfront.net/nickoloff2/Figures/06fig01_alt.jpg" alt="Figure 6.1">

## Setting resource allowances

Physical system resources such as memory and time on the CPU are scarce. If the
resource consumption of processes on a computer exceeds the available physical
resources, the processes will experience performance issues and may stop running.
Part of building a system that creates strong isolation includes providing resource
allowances on individual containers.

If you want to make sure that a program won’t overwhelm other programs on your
computer, the easiest thing to do is set limits on the resources that it can use. You can
manage memory, CPU, and device resource allowances with Docker. By default,
Docker containers may use unlimited CPU, memory, and device I/O resources. The
`docker container create` and `docker container run` commands provide flags for
managing resources available to the container.

### Memory limits

Memory limits are the most basic restriction you can place on a container. They restrict
the amount of memory that processes inside a container can use. Memory limits are
useful for ensuring that one container can’t allocate all of the system’s memory, starving
other programs for the memory they need. You can put a limit in place by using
the `-m` or `--memory` flag on the `docker container run` or `docker container create`
commands. The flag takes a value and a unit. The format is as follows:

    <number><optional unit> where unit = b, k, m or g

In the context of these commands, b refers to bytes, k to kilobytes, m to megabytes, and
g to gigabytes. Put this new knowledge to use and start up a database application that
you’ll use in other examples:

    docker container run -d --name ch6_mariadb \
        --memory 256m \
        --cpu-shares 1024 \
        --cap-drop net_raw \
        -e MYSQL_ROOT_PASSWORD=test \
        mariadb:5.5

With this command, you install database software called MariaDB and start a container
with a memory limit of 256 megabytes. You might have noticed a few extra
flags on this command. This chapter covers each of those, but you may already be
able to guess what they do. Something else to note is that you don’t expose any ports
or bind any ports to the host’s interfaces. It will be easiest to connect to this database
by linking to it from another container on the host. Before we get to that, we want to
make sure you have a full understanding of what happens here and how to use memory
limits.
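To make the unit suffixes concrete, here is a minimal sketch that converts a `--memory` value into bytes. The `to_bytes` function is a hypothetical helper, not part of Docker, and it assumes binary multiples (1k = 1024 bytes), which is how Docker appears to interpret these suffixes.

```shell
# Convert a Docker-style size such as 256m into bytes.
# Assumption: suffixes are binary multiples (k = 1024 bytes).
to_bytes() {
    value=$1
    number=${value%[bkmg]}      # strip one trailing unit letter, if present
    unit=${value#"$number"}     # whatever was stripped is the unit
    case $unit in
        ""|b) echo "$number" ;;
        k)    echo $((number * 1024)) ;;
        m)    echo $((number * 1024 * 1024)) ;;
        g)    echo $((number * 1024 * 1024 * 1024)) ;;
    esac
}

to_bytes 256m   # the limit given to ch6_mariadb, prints 268435456
to_bytes 1g     # prints 1073741824
```

You can compare the first result with the byte count that `docker container inspect --format '{{.HostConfig.Memory}}' ch6_mariadb` reports for the running container.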

The most important thing to understand about memory limits is that they’re not
reservations. They don’t guarantee that the specified amount of memory will be available.
They’re only a protection from overconsumption. Additionally, the implementation
of the memory accounting and limit enforcement by the Linux kernel is very
efficient, so you don’t need to worry about runtime overhead for this feature.

Before you put a memory allowance in place, you should consider two things. First,
can the software you’re running operate under the proposed memory allowance? Second,
can the system you’re running on support the allowance?

The first question is often difficult to answer. It’s not common to see minimum
requirements published with open source software these days. Even if it were, though,
you’d have to understand how the memory requirements of the software scale based
on the size of the data you’re asking it to handle. For better or worse, people tend to
overestimate and adjust based on trial and error. One option is to run the software in
a container with real workloads and use the **`docker stats`** command to see how
much memory the container uses in practice. For the `ch6_mariadb` container we just
started, `docker stats ch6_mariadb` shows that the container is using about 100 megabytes
of memory, fitting well inside its 256-megabyte limit. In the case of memory-sensitive
tools like databases, skilled professionals such as database administrators
can make better-educated estimates and recommendations. Even then, the question
is often answered by another: how much memory do you have? And that leads to the
second question.

Can the system you’re running on support the allowance? It’s possible to set a
memory allowance that’s bigger than the amount of available memory on the system.
On hosts that have swap space (virtual memory that extends onto disk), a container
may still realize the allowance. But if the allowance is greater than any memory
resource the host can provide, the limitations of the system will always cap
the container, and runtime behavior will be similar to not having specified an allowance
at all.
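When you size an allowance by observation, as with the `docker stats` approach described earlier, it can help to turn the reading into a simple check. The sketch below is one way to do that. The `mem_check` function and its default threshold of 80 percent are assumptions made for illustration, and the sample readings are invented.

```shell
# Flag a memory reading that is above a chosen share of the container's limit.
# $1: a percentage as printed by docker stats, for example 39.06%
# $2: warning threshold in percent (defaults to 80, an arbitrary choice)
mem_check() {
    awk -v used="$1" -v max="${2:-80}" 'BEGIN {
        sub(/%$/, "", used)               # drop the trailing percent sign
        if (used + 0 > max + 0) print "over threshold"
        else                    print "ok"
    }'
}

mem_check 39.06%    # roughly 100 MB of a 256 MB limit
mem_check 91.50%    # a made-up reading close to the limit
```

With a running container, you could feed it a live reading with `mem_check "$(docker stats --no-stream --format '{{.MemPerc}}' ch6_mariadb)"`.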

Finally, understand that there are several ways that software can fail if it exhausts
the available memory. Some programs may fail with a memory access fault, whereas
others may start writing out-of-memory errors to their logging. Docker neither detects
this problem nor attempts to mitigate the issue. The best it can do is apply the restart
logic you may have specified using the `--restart` flag described in chapter 2.

> If your services or containers attempt to use more memory than the system has available, you may experience an Out Of Memory Exception (OOME) and a container, or the Docker daemon, might be killed by the kernel OOM killer. To prevent this from happening, ensure that your application runs on hosts with adequate memory.

### CPU

Processing time is just as scarce as memory, but the effect of starvation is performance
degradation instead of failure. A paused process that is waiting for time on the CPU is
still working correctly. But a slow process may be worse than a failing one if it’s running
an important latency-sensitive data-processing program, a revenue-generating
web application, or a backend service for your app. Docker lets you limit a container’s
CPU resources in two ways.

First, you can specify the relative weight of a container to other containers. Linux
uses this to determine the percentage of CPU time the container should use relative
to other running containers. That percentage is for the sum of the computing cycles
of all processors available to the container.

To set the CPU shares of a container and establish its relative weight, both `docker container run` and `docker container create` offer a `--cpu-shares` flag. The value
provided should be an integer (which means you shouldn’t quote it). Start another
container to see how CPU shares work:

    docker container run -d -P --name ch6_wordpress \
        --memory 512m \
        --cpu-shares 512 \
        --cap-drop net_raw \
        --link ch6_mariadb:mysql \
        -e WORDPRESS_DB_PASSWORD=test \
        wordpress:5.0.0-php7.2-apache

This command will download and start WordPress version 5.0. It’s written in PHP and
is a great example of software that has been challenged by adapting to security risks.
Here we’ve started it with a few extra precautions. If you’d like to see it running on
your computer, use `docker port ch6_wordpress` to get the port number (we’ll call it
`<port>`) that the service is running on and open `http://localhost:<port>` in your web
browser. If you’re using Docker Machine, you’ll need to use `docker-machine ip` to
determine the IP address of the virtual machine where Docker is running. When you
have that, substitute that value for `localhost` in the preceding URL.

When you started the MariaDB container, you set its relative weight (`--cpu-shares`)
to 1024, and you set the relative weight of WordPress to 512. These settings create a
system in which the MariaDB container gets two CPU cycles for every one WordPress
cycle. If you started a third container and set its `--cpu-shares` value to 2048, it would
get about 57% of the CPU cycles (2048 of the 3584 total shares), and MariaDB and
WordPress would split the remaining 43% in the same 2:1 proportion as before.
Figure 6.2 shows how portions change based on the total weight of the system.

<img src="https://dpzbhybb2pdcj.cloudfront.net/nickoloff2/Figures/06fig02_alt.jpg" alt="Figure 6.2">
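The proportions are easy to check for any set of weights: under full contention, each container’s slice is its weight divided by the sum of all weights. The following is a minimal sketch of that arithmetic. `cpu_split` is a hypothetical helper rather than a Docker command, and the container name `third` is made up for illustration.

```shell
# Print each container's share of CPU time when every container is busy.
# Arguments are name=weight pairs, where weight is the --cpu-shares value.
cpu_split() {
    printf '%s\n' "$@" |
    awk -F= '
        { name[NR] = $1; weight[NR] = $2; total += $2 }
        END {
            for (i = 1; i <= NR; i++)
                printf "%s %.1f%%\n", name[i], 100 * weight[i] / total
        }'
}

cpu_split ch6_mariadb=1024 ch6_wordpress=512
# ch6_mariadb 66.7%
# ch6_wordpress 33.3%

cpu_split ch6_mariadb=1024 ch6_wordpress=512 third=2048
# ch6_mariadb 28.6%
# ch6_wordpress 14.3%
# third 57.1%
```

These figures describe only the contended case; an otherwise idle machine lets any one of the containers use more.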

**CPU shares differ from memory limits in that they’re enforced only when there is contention
for time on the CPU.** If other processes and containers are idle, the container
may burst well beyond its limits. This approach ensures that CPU time is not wasted
and that limited processes will yield if another process needs the CPU. The intent of
this tool is to prevent one or a set of processes from overwhelming a computer, not to
hinder performance of those processes. The defaults won’t limit the container, and it
will be able to use 100% of the CPU if the machine is otherwise idle.

Now that you have learned how `--cpu-shares` allocates CPU proportionately, we
will introduce the `--cpus` option, which provides a way to limit the total amount of
CPU used by a container. The `--cpus` option allocates a quota of CPU resources the
container may use by configuring the Linux Completely Fair Scheduler (CFS).
Docker helpfully allows the quota to be expressed as the number of CPU cores the
container should be able to use. The CPU quota is allocated, enforced, and ultimately
refreshed every 100ms by default. If a container uses all of its CPU quota, its
CPU usage will be throttled until the next measurement period begins. The following
command will let the previous WordPress example consume a maximum of 0.75
CPU cores:

    docker stop ch6_wordpress
    docker container rm ch6_wordpress

    docker container run -d -P --name ch6_wordpress \
        --memory 512m \
        --cpus 0.75 \
        --cap-drop net_raw \
        --link ch6_mariadb:mysql \
        -e WORDPRESS_DB_PASSWORD=test \
        wordpress:5.0.0-php7.2-apache
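To see what a `--cpus` value means in scheduler terms, you can convert it to a quota for one enforcement period. The sketch below assumes the default 100ms period (100,000 microseconds) mentioned earlier. `cfs_quota` is a hypothetical helper, and the optional second argument exists only so you can try other periods.

```shell
# Translate a --cpus value into a CFS quota in microseconds.
# $1: number of cores, such as 0.75
# $2: period in microseconds (assumed default: 100000, that is 100ms)
cfs_quota() {
    awk -v cpus="$1" -v period="${2:-100000}" \
        'BEGIN { printf "%.0f\n", cpus * period }'
}

cfs_quota 0.75   # prints 75000: 75ms of CPU time in every 100ms period
cfs_quota 2      # prints 200000: two full cores
```

As far as we understand it, this is the same pairing you would get by passing `--cpu-period 100000 --cpu-quota 75000` instead of `--cpus 0.75`.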

Another feature Docker exposes is the ability to assign a container to a specific CPU
set. Most modern hardware uses multicore CPUs. Roughly speaking, a CPU can process
as many instructions in parallel as it has cores. This is especially useful when you’re
running many processes on the same computer.

A context switch is the task of changing from executing one process to executing
another. Context switching is expensive and may cause a noticeable impact on the
performance of your system. In some cases, it makes sense to reduce context switching
of critical processes by ensuring they never have to share a set of CPU cores with other processes.
You can use the `--cpuset-cpus` flag on `docker container run` or `docker container create` to limit a container to execute only on a specific set of CPU cores.

You can see the CPU set restrictions in action by stressing one of your machine’s
cores and examining your CPU workload:

    # Start a container limited to a single CPU and run a load generator
    docker container run -d \
        --cpuset-cpus 0 \
        --name ch6_stresser dockerinaction/ch6_stresser
    
    # Start a container to watch the CPU load while the stresser runs
    docker container run -it --rm dockerinaction/ch6_htop

Once you run the second command, you’ll see htop display the running processes
and the workload of the available CPUs. The ch6_stresser container will stop running
after 30 seconds, so it’s important not to delay when you run this experiment.
When you finish with htop, press Q to quit. Before moving on, remember to shut
down and remove the container named ch6_stresser:

    docker rm -vf ch6_stresser

We thought this was exciting when we first used it. To get the best appreciation, repeat
this experiment a few times by using different values for the `--cpuset-cpus` flag. If
you do, you’ll see the process assigned to different cores or different sets of cores. The
value can be either a list or range:

- 0,1,2—A list including the first three cores of the CPU
- 0-2—A range including the first three cores of the CPU
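If you want to confirm that two values name the same cores, you can expand them yourself. This is a minimal sketch; `expand_cpuset` is a hypothetical helper, and it assumes the value is well formed.

```shell
# Expand a --cpuset-cpus value into the individual core numbers it selects.
expand_cpuset() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        last=${last:-$first}              # a single core is a range of one
        while [ "$first" -le "$last" ]; do
            echo "$first"
            first=$((first + 1))
        done
    done | xargs                          # join the numbers onto one line
}

expand_cpuset 0,1,2   # prints 0 1 2
expand_cpuset 0-2     # prints 0 1 2
```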

### Access to devices

Devices are the final resource type we will cover. Controlling access to devices differs
from memory and CPU limits. Providing access to a host’s device inside a container is
more like a resource-authorization control than a limit.

Linux systems have all sorts of devices, including hard drives, optical drives, USB
drives, mice, keyboards, sound devices, and webcams. Containers have access to some
of the host’s devices by default, and Docker creates other devices specifically for each
container. This works similarly to how a virtual terminal provides dedicated input and
output devices to the user.

On occasion, it may be important to share other devices between a host and a
specific container. Say you’re running computer vision software that requires access
to a webcam, for example. In that case, you’ll need to grant the container
running your software access to the webcam device attached to the system. You can use the
`--device` flag to specify a set of devices to mount into the new container. The following
example would map your webcam at /dev/video0 to the same location
within a new container. Running this example will work only if you have a webcam at
/dev/video0:

    docker container run -it --rm \
        --device /dev/video0:/dev/video0 \
        ubuntu:16.04 ls -al /dev

The value provided must be a map between the device file on the host operating system
and the location inside the new container. The device flag can be set many times
to grant access to different devices.
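Because the mapping works only when the device exists on the host, a script might check for it before building the flag. The sketch below shows one way to do that. `device_flag` is a hypothetical helper, and `/dev/null` stands in for a device that is present on any Linux host.

```shell
# Emit a --device mapping only when the host path is a real device node.
# The device is mapped to the same location inside the container.
device_flag() {
    if [ -c "$1" ] || [ -b "$1" ]; then   # character or block device
        echo "--device $1:$1"
    else
        echo "skipping $1: no such device on this host" >&2
        return 1
    fi
}

device_flag /dev/null             # prints --device /dev/null:/dev/null
device_flag /dev/video0 || true   # result depends on whether you have a webcam
```

You could then splice the output into a command, for example `docker container run -it --rm $(device_flag /dev/video0) ubuntu:16.04 ls -al /dev`.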

People in situations with custom hardware or proprietary drivers will find this kind
of access to devices useful. It’s preferable to resorting to modifying their host operating
system.

## Sharing memory

## Understanding users

Docker starts containers as the user that is specified by the image metadata by default,
which is often the root user. The root user has almost full privileged access to the state
of the container. Any processes running as that user inherit those permissions. It follows
that if there’s a bug in one of those processes, it might damage the container.
There are ways to limit the damage, but the most effective way to prevent these types
of issues is not to use the root user.

Reasonable exceptions exist; sometimes using the root user is the best or only available
option. You use the root user for building images, and at runtime when there’s
no other option. Similarly, at times you might want to run system administration software
inside a container. In those cases, the process needs privileged access not only to
the container but also to the host operating system. This section covers the range of
solutions to these problems.

### Working with the run-as user

Before you create a container, it would be nice to be able to know what username (and
user ID) is going to be used by default. The default is specified by the image. There’s
currently no way to examine an image to discover attributes such as the default user in
Docker Hub. You can inspect image metadata by using the docker inspect command.
If you missed it in chapter 2, the inspect subcommand displays the metadata of a specific
container or image. Once you’ve pulled or created an image, you can get the
default username that the container is using with the following commands:

    docker image pull busybox:1.29
    docker image inspect busybox:1.29
    # Show only the run-as user defined by the busybox image
    docker inspect --format "{{.Config.User}}" busybox:1.29

If the result is blank, the container will default to running as the root user. If it isn’t
blank, either the image author specifically named a default run-as user or you set a
specific run-as user when you created the container. The `--format` or `-f` option used
in the last command allows you to specify a template to render the output. In this
case, you’ve selected the User field of the Config property of the document. The value
can be any valid Go template, so if you’re feeling up to it, you can get creative with
the results.
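In a script, the blank-means-root rule is easy to overlook, so it can be worth encoding explicitly. This is a minimal sketch; `run_as_user` is a hypothetical helper, and the `1000:1000` value is invented to represent an image that sets its own user.

```shell
# Report the user a container would start as, given an image's Config.User value.
# An empty value means the image did not set a user, so the default is root.
run_as_user() {
    echo "${1:-root}"
}

run_as_user ""            # prints root
run_as_user "1000:1000"   # prints 1000:1000
```

To apply it to a real image, pass in the inspected value: `run_as_user "$(docker image inspect --format '{{.Config.User}}' busybox:1.29)"`.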

This approach has a problem. The run-as user might be changed by the entrypoint
or command the image uses to start up. These are sometimes referred to as boot, or init,
scripts. The metadata returned by docker inspect includes only the configuration that
the container will start with. So if the user changes, it won’t be reflected there.

Currently, the only way to fix this problem is to look inside the image. You could
expand the image files after you download them, and examine the metadata and init
scripts by hand, but doing so is time-consuming and easy to get wrong. For the time
being, it may be better to run a simple experiment to determine the default user. This
will solve the first problem but not the second:

    docker container run --rm --entrypoint "" busybox:1.29 whoami
        
    docker container run --rm --entrypoint "" busybox:1.29 id

This demonstrates two commands that you might use to determine the default user of
an image (in this case, busybox:1.29). Both the whoami and id commands are common
among Linux distributions, and so they’re likely to be available in any given
image. The second command is superior because it shows both the name and ID
details for the run-as user. Both these commands are careful to unset the entrypoint of
the container. This will make sure that the command specified after the image name is
the command that is executed by the container. These are poor substitutes for a first-class
image metadata tool, but they get the job done.