# Introduction to Containers: units of software?

**James Meakin - Wednesday 23 Jan, DIAG UoK Workshop**

<img src="docker.png" width="150"/>


## Overview

- Motivation
- What are containers?
- Why containers?
- What is Docker?
- How do I create a container?

## Motivation

We want to run processes:
 - a process to train your model
 - a process to apply that model to new cases

To do this, the process needs to be packaged with all its dependencies:
 - data (image data, model weights)
 - runtime environment (Linux, Windows)
 - external libraries (Python, CUDA, ...)

## How would we bundle a process and its dependencies?

- Install an OS into a virtual machine image (e.g., VirtualBox)
- Add all the executables and dependencies (by hand?)
- Shut down the machine, package it, upload it somewhere, and document it on a wiki

## Virtual Machines

<img src="VM@2x.png" width="300">

VMs abstract hardware, allowing many systems to run on the same physical infrastructure. 

Problems:
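 - Size: every image carries a complete guest OS
 - Speed: the guest OS has to boot before the process can start
 - Hypervisor compatibility: an image built for one hypervisor may not run on another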


## What are containers?

<img src="Container@2x.png" width="300">

> Containers are a way of packaging a process
>  - self-contained
>  - portable
>  - lightweight

## How do containers compare to VMs?

<div>
<img src="VM@2x.png" width="300" style="float:left"><img src="Container@2x.png" width="300" style="float:right">
</div>

<div style="float:left">
Containers are an abstraction at the application level; virtual machines abstract the hardware.

Containers:

<ul>
    <li> share the host kernel (start almost instantly, use less memory) </li>
    <li> don't need to include an entire OS </li>
    <li> are now standardized </li>
</ul>
</div>


## Containers are not a primitive

Low level:

> A combination of two Linux kernel primitives:
>  - `namespaces`: control what a process can **see** (pid, mount, network, ipc, user, ...)
>  - `cgroups`: control what a process can **use** (memory, cpu, blkio, cpuset, devices, ...)
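
Both primitives can be inspected from any Linux process through `/proc`. The following is a minimal sketch, assuming a Linux host with `/proc` mounted; the helper names `list_namespaces` and `read_cgroup_membership` are hypothetical and not part of any Docker tooling. On other systems both helpers simply return empty results.

```python
import os
from pathlib import Path


def list_namespaces(pid="self"):
    """Return {namespace_type: identifier} for a process, or {} if unavailable."""
    ns_dir = Path("/proc") / str(pid) / "ns"
    namespaces = {}
    if not ns_dir.is_dir():
        # Not Linux, /proc not mounted, or no such process.
        return namespaces
    for entry in sorted(ns_dir.iterdir()):
        try:
            # Each entry is a symlink whose target names the namespace.
            namespaces[entry.name] = os.readlink(entry)
        except OSError:
            # Some entries may be unreadable without extra privileges.
            continue
    return namespaces


def read_cgroup_membership(pid="self"):
    """Return the lines of /proc/<pid>/cgroup, or [] if unavailable."""
    try:
        return (Path("/proc") / str(pid) / "cgroup").read_text().splitlines()
    except OSError:
        return []


if __name__ == "__main__":
    for name, identifier in list_namespaces().items():
        print(f"namespace {name}: {identifier}")
    for line in read_cgroup_membership():
        print(f"cgroup {line}")
```

Two processes that print the same identifier for a namespace type share that namespace; a process inside a container would typically report different identifiers from a process on the host.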

## Why Containers?

Containers help us solve three problems:
- Encapsulation
  - All necessary files, isolation
- Dependency Management
  - All necessary libraries
- Immutability
  - We can easily keep many different versions of container images
  - Changes that you make on top of the image do not persist!
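
The immutability point can be illustrated with a toy model. This is only a sketch of the idea, not how Docker stores files: the image is a read-only mapping of hypothetical paths to contents, and each container gets a fresh writable layer on top of it.

```python
from collections import ChainMap
from types import MappingProxyType

# Hypothetical image contents: a read-only mapping of path -> file content.
IMAGE = MappingProxyType({"/app/model.py": "v1", "/etc/config": "default"})


def start_container(image):
    """Return a filesystem view: a fresh writable layer on top of the image."""
    # ChainMap sends all writes to the first mapping; the image is only read.
    return ChainMap({}, image)


first = start_container(IMAGE)
first["/etc/config"] = "edited"    # the change lands in the writable layer
first["/tmp/output"] = "results"

second = start_container(IMAGE)    # a new container starts from the image

print(first["/etc/config"])        # value seen inside the first container
print(second["/etc/config"])       # value seen inside the second container
print("/tmp/output" in second)     # whether the first container's file leaked
```

The first container sees its own edits, while the second container and the image itself are unchanged.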

## What is Docker?

Docker is a **containerization platform**. It provides:
- a format for building and packaging containers: `container image`
- a place to download, upload and persist container images: `docker registry`
- tools for running containers on individual hosts


## Container Registry

[Demo](https://hub.docker.com)

Use the container registry to `pull` container images to your local machine and to `push` images from it.

## Running images

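Use `docker run` to start a container from an image. Useful extras:
- `-p`: publish a container port on the host
- `-i`: interactive mode (keep STDIN open)
- `-t`: allocate a pseudo-TTY
- `-d`: detach - run in the background
- `-e`: set an environment variable

Note that changes made inside a container are not saved to the image and are lost when the container is removed! To keep data, use:
- `-v`: attach a volume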

Useful commands:
- `docker image list`
- `docker volume list`
- `docker ps`
- `docker stats`
- `docker stop`
- `docker system prune`
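
The `docker run` flags listed above can be combined in one invocation. Here is a minimal sketch that only assembles the argument list and does not call Docker; `build_run_command` is a hypothetical helper, and the script name `train.py`, the `EPOCHS` variable and the `/data` and `/input` paths are made up for illustration.

```python
import shlex


def build_run_command(image, command=(), ports=None, env=None, volumes=None,
                      detach=False, interactive=False, tty=False):
    """Assemble a `docker run` argument list from the common flags."""
    argv = ["docker", "run"]
    if detach:
        argv.append("-d")                             # run in the background
    if interactive:
        argv.append("-i")                             # keep STDIN open
    if tty:
        argv.append("-t")                             # allocate a pseudo-TTY
    for host_port, container_port in (ports or {}).items():
        argv += ["-p", f"{host_port}:{container_port}"]
    for name, value in (env or {}).items():
        argv += ["-e", f"{name}={value}"]
    for source, target in (volumes or {}).items():
        argv += ["-v", f"{source}:{target}"]
    argv.append(image)                                # options come before the image
    argv.extend(command)                              # the command comes after it
    return argv


argv = build_run_command(
    "python:3.6-slim",
    command=["python", "train.py"],   # hypothetical script
    ports={8888: 8888},
    env={"EPOCHS": "10"},             # hypothetical variable
    volumes={"/data": "/input"},      # hypothetical host and container paths
    detach=True,
)
print(" ".join(shlex.quote(part) for part in argv))
```

Passing the list to `subprocess.run(argv)` would start the container on a machine with Docker installed; building a list instead of a single string avoids shell-quoting problems.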


## Container Images

Use a `Dockerfile` to create container images.

Choose the `FROM` image carefully: `ubuntu` is 188 MB, `debian` 125 MB, `alpine` 5 MB. Try `python:3.6-slim` or `python:3.6-alpine`.

Build images with `docker build`.
- Remember the layering: a file added in one layer stays in the image even if a later layer deletes it, so never include secrets.
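
Why layering and secrets do not mix can be shown with a toy model. This is a sketch under simplifying assumptions, not Docker's actual storage format: each layer is a dict of hypothetical paths, and a value of `None` marks a deletion.

```python
# Toy model: each layer maps a path to its content, or to DELETED when the
# layer removes that path.
DELETED = None


def final_filesystem(layers):
    """Apply layers in order and return the files visible in the container."""
    visible = {}
    for layer in layers:
        for path, content in layer.items():
            if content is DELETED:
                visible.pop(path, None)
            else:
                visible[path] = content
    return visible


def find_in_history(layers, path):
    """Return (layer_index, content) for every layer that still stores `path`."""
    return [(index, layer[path]) for index, layer in enumerate(layers)
            if layer.get(path) is not DELETED]


layers = [
    {"/bin/python": "interpreter"},   # base image
    {"/app/token.txt": "s3cr3t"},     # a secret copied in (hypothetical)
    {"/app/model.bin": "weights"},    # something built using the secret
    {"/app/token.txt": DELETED},      # the secret "removed" in a later layer
]

print(sorted(final_filesystem(layers)))            # files the container sees
print(find_in_history(layers, "/app/token.txt"))   # what the layers still hold
```

The secret is gone from the final filesystem but is still stored in the earlier layer, and anyone who can pull the image can read that layer.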

