

<img src='https://regmedia.co.uk/2015/09/29/docker_logo.jpg'/>

# Introduction to Docker

Docker is a way to package apps, and all of their dependencies, into **images** that can be easily loaded onto any computer that has Docker installed, and executed ("instantiated") as a **container**.

Docker containers are like a virtual machine (like the one you are using to take this course), but they are much, much smaller because the image used to instantiate a container doesn't contain a complete operating system (in particular, it has no kernel of its own).  A container contains only the app, and the other things the app needs to run - all other functionality comes from the host operating system.  Apps are "installed" and configured inside of the container so that the user doesn't need to go through the installation or configuration process - the app "just works".

As data scientists, you are going to create a lot of software in your lives!  Docker (and other associated projects) is a good option for making your software available for others to use.

In this introductory course, we will only learn how to USE Docker images/containers; we will not learn how to CREATE them (this may be added to the Bioinformatics Programming Challenges course this year!).

Surf to:  https://hub.docker.com/explore/

This page lets you browse or search for images that have apps you want.  For example, if you wanted to write some code in Python, but didn't have Python installed, you could download the Python Docker image (search for it).  Similarly, if you wanted to run the BLAST nucleotide/amino acid sequence search program, you could use its Docker distribution instead of installing it (search for "Blast", then search for "bioinformatics").

## Installing Docker

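Docker is, itself, an app, so it must be installed.  **I have done this for you already**, but here are the instructions I used to install it on Ubuntu Linux (the "xenial" release).  Installation procedures change over time, so check the official Docker documentation for the current steps: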

#### Get the keys to validate the Docker repository

    sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 \
      --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
 
#### Add the official Docker repository to your apt sources
 
    sudo apt-add-repository 'deb https://apt.dockerproject.org/repo ubuntu-xenial main'
 
#### Update the package lists
 
    sudo apt update

#### Install some Docker dependencies
 
    sudo apt install linux-image-generic linux-image-extra-virtual
 
#### Reboot
 
    sudo reboot
 
#### Install Docker
 
    sudo apt install docker-engine
 
#### Add your user id to the docker group so that permissions are not a problem (here the user is "osboxes")

    sudo usermod -a -G docker osboxes
    

## Getting Docker Images

To "get" an image, the easiest thing to do is to "pull" it.  **pull** only downloads the Image to your system; it does not run it (i.e. it does not instantiate it into a Container).  The **pull** step **is not strictly necessary** - if the Image is not already on your system, the **docker run** command (below) will also do a **pull** before it attempts to run the Image; however, if you wanted to use an existing Image as the base for creating your own software app and your own Docker Image, you might want to **pull** it without running it.





In [2]:
docker pull hello-world


Using default tag: latest
latest: Pulling from library/hello-world
Digest: sha256:451ce787d12369c5df2a32c85e5a03d52cbcef6eb3586dd03075f3034f10adcd
Status: Image is up to date for hello-world:latest


## Running Docker Containers

The easiest way to run an image - i.e. create a Container from that Image - is to simply **run** it:


In [3]:
docker run hello-world



Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/



To see what Images you have, use the **docker images** command:


In [4]:
docker images

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
hello-world         latest              fce289e99eb9        8 months ago        1.84kB


Look at how small hello-world is!  This is why Docker is becoming more popular than Virtual Machines, or even software installations (I don't have to **install** the hello-world app, I simply **docker run** it!).

Once you are finished with an Image, you can delete it with the **docker rmi** (remove image) command.  Here I use the IMAGE ID of the Image I want to remove - in this case, the hello-world Image (the Image name would also work).  However, because we just ran it, we created a Container, and that Container still exists even though it has stopped.  We must first remove the Container.  See the error message when I try to remove the hello-world Image:


In [5]:
docker rmi fce289e99eb9   # this will be different every time...
# note that the error below tells us the container id

Error response from daemon: conflict: unable to delete fce289e99eb9 (must be forced) - image is being used by stopped container 4ab6fa7cf9dd
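
The Container ID that **docker rm** needs is the last word of that error message. As a minimal sketch (the function name `container_id_from_error` is a hypothetical helper of mine, not a Docker command), it can be pulled out of the message with `sed`:

```shell
# Sketch: extract the Container ID from a "docker rmi" conflict message.
# container_id_from_error is a hypothetical helper, not part of Docker.
container_id_from_error() {
    # Print the hexadecimal ID that follows the word "container", if any.
    printf '%s\n' "$1" | sed -n 's/.*container \([0-9a-f]\{12,\}\).*/\1/p'
}

# The error message copied from the cell above.
msg='Error response from daemon: conflict: unable to delete fce289e99eb9 (must be forced) - image is being used by stopped container 4ab6fa7cf9dd'

container_id_from_error "$msg"
```

You can also list every Container, including stopped ones, with `docker ps -a` and read the ID from the first column.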



In [12]:
docker rm e816a4da6647   # use the Container ID from YOUR error message; it will be different every time  (note that it is "rm" (remove) not "rmi" (remove image))

e816a4da6647


In [13]:
docker rmi fce289e99eb9  # now that I have removed the Container, I can remove the Image

Untagged: hello-world:latest
Untagged: hello-world@sha256:451ce787d12369c5df2a32c85e5a03d52cbcef6eb3586dd03075f3034f10adcd
Deleted: sha256:fce289e99eb9bca977dae136fbe2a82b6b7d4c372474c9235adc1741675f587e
Deleted: sha256:af0b15c8625bb1938f1d7b17081031f649fd14e6b233688eea3c5483994a66a3


In [14]:
docker images

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE


## Running more complex software

hello-world is not very complex :-)  Docker containers can contain many kinds of functionality, from web servers to databases to entire operating system environments.  Often you will need to configure the app(s) inside of the Container, or give the Container access to certain folders on your computer, or open certain ports into the Container (e.g. so that the world can access your Web server).

This is achieved using **docker run**, with additional arguments.

For example, to "map" a communication port from the host computer to the Container (e.g. port 80 is the normal port for a Web server), you would add:

     -p 80:80    # this means, the host port 80 goes to the Container port 80
     
You can also pass Environment Variables into the Container using the '-e' argument:

     -e MY_VARIABLE_HERE=some_value

Every Image will have different instructions about how to run it, [so I won't (can't!) explain all of the options here](https://docs.docker.com/engine/reference/commandline/run/#options).  Just read the instructions for the Image you want to use.

The last argument we will learn is "--name".  This is used to give a specific (memorable) name to your Container.  Moreover, it allows you to use the same Image, with different configurations, in different Containers (e.g. you run the **same Web Server Image twice**: one Container runs your Personal Website (--name my-personal-website) and the other one runs your Laboratory Website (--name lab-web-server)).  The two Containers can run at the same time, as long as each one is mapped to a different port on the host computer!
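
To see how "-p", "-e" and "--name" fit together, here is a minimal sketch that only prints the two **docker run** commands for the website example, without executing them. The Image name `my-web-server`, the variable `SITE_TITLE`, the host ports 8080 and 8081, and the helper `build_run_command` are all placeholders I made up for illustration; a real Image will document its own options.

```shell
# Sketch: print (do NOT execute) "docker run" commands for two Containers
# created from the same Image.
# ASSUMPTIONS: "my-web-server" and SITE_TITLE are made-up placeholders, and
# build_run_command is a hypothetical helper, not part of Docker.
build_run_command() {
    name=$1        # memorable Container name (--name)
    host_port=$2   # port on the host computer, mapped to Container port 80
    site_title=$3  # value passed into the Container with -e
    printf 'docker run --name %s -p %s:80 -e SITE_TITLE=%s my-web-server\n' \
        "$name" "$host_port" "$site_title"
}

# Same Image, two configurations, two different host ports.
build_run_command my-personal-website 8080 personal
build_run_command lab-web-server 8081 lab
```

Each Container gets its own host port because two Containers cannot both listen on the same port of the host computer.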


## The MySQL Database Server Docker Image

MySQL is a very popular, open-source database system widely used in Bioinformatics.  MySQL can be installed directly onto your computer; however, it can sometimes be a bit difficult to configure it.  Using the Docker Image, it becomes extremely easy!

https://hub.docker.com/search/?isAutomated=0&isOfficial=0&page=1&pullCount=0&q=mysql&starCount=0

The "Official MySQL" Image is the one we want.  Click on it now.

https://hub.docker.com/_/mysql/

### Image tags

There are often many versions of a Docker image.  If you don't care which version you use, there is a tag called "latest" which will give you the latest... **BUT YOU SHOULD ALWAYS CARE!!!**  You cannot do reproducible science if you don't know what version of software you are using, or when the software you are using suddenly changes in the middle of your PhD!

These were the MySQL Docker Image tags available at the time of writing (the list on the page will have changed since then):

    Supported tags and respective Dockerfile links

    8.0.11, 8.0, 8, latest (8.0/Dockerfile)
    5.7.22, 5.7, 5 (5.7/Dockerfile)
    5.6.40, 5.6 (5.6/Dockerfile)
    5.5.60, 5.5 (5.5/Dockerfile)
    

I know that Version 8.x.x has some problems, so I don't want to use it.  I will use the last version from the v5 series (5.7.22).  The way to refer to a specific version of an Image is to say:   imagename:tag

For example, the Image we want is:

       mysql:5.7.22
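
As a rough illustration of the imagename:tag convention, the sketch below splits a reference into its two parts using plain shell parameter expansion. The helpers `image_name` and `image_tag` are hypothetical names of mine, and the sketch deliberately ignores more complex references (for example, ones that include a registry address with a port number).

```shell
# Sketch: split an Image reference of the form imagename:tag.
# image_name and image_tag are hypothetical helpers, not Docker commands.
image_name() {
    # Drop everything from the first ":" onwards.
    printf '%s\n' "${1%%:*}"
}

image_tag() {
    case $1 in
        *:*) printf '%s\n' "${1##*:}" ;;  # keep what follows the last ":"
        *)   printf 'latest\n' ;;          # no tag given: Docker uses "latest"
    esac
}

image_name mysql:5.7.22   # prints mysql
image_tag mysql:5.7.22    # prints 5.7.22
image_tag hello-world     # prints latest
```

The last line mirrors the "Using default tag: latest" message that **docker pull hello-world** printed earlier.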


You can read the rest of the instructions (lower in the page) by yourself.  Basically, the main argument you need to provide to the Container is the password you want to use for the 'root' database user, passed as an Environment Variable (-e MYSQL_ROOT_PASSWORD=some_password_here).

One instruction that they (annoyingly!) forget to tell you is that, to access the MySQL server inside of the Container from your own computer, you need to map the host port 3306 to the Container port 3306.

The final command line to start the MySQL Server is:

     docker run --name course-mysql -p 3306:3306 -e MYSQL_ROOT_PASSWORD=root -d mysql:5.7.22
    
This runs mysql v5.7.22, with the root password of 'root', with port connections of 3306->3306, and creates a Container named "course-mysql".  It also "detaches" (-d), meaning that the Container runs in the background: after you execute the **docker run** command you get your terminal prompt back, and the Container will continue to run even if you close the terminal window (it will NOT continue to run after you shut down the computer!).

Open a terminal window and execute that command now.... then come back to this page



To see what Containers are running use the **docker ps** command:


In [15]:
docker ps

CONTAINER ID        IMAGE               COMMAND                  CREATED              STATUS              PORTS                    NAMES
f0ff0c64dec2        mysql:5.7.22        "docker-entrypoint.s…"   About a minute ago   Up About a minute   0.0.0.0:3306->3306/tcp   course-mysql


Note that your container is named "course-mysql".  You can now use this container name to quickly start or stop the MySQL server using the **docker stop** and **docker start** commands:


In [16]:
docker stop course-mysql

course-mysql


In [17]:
docker ps

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES


In [18]:
docker start course-mysql

course-mysql


In [19]:
docker ps

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
f0ff0c64dec2        mysql:5.7.22        "docker-entrypoint.s…"   3 minutes ago       Up 2 seconds        0.0.0.0:3306->3306/tcp   course-mysql


**NOTE AGAIN**: when you **docker start** a Container, it starts __with the same configuration that was used to create it__.  This is very useful when you have various configurations of the same software.
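
If you want a script to check whether a named Container is currently running, one possible approach is to look for its name in the last column (NAMES) of the **docker ps** output. The sketch below is an assumption-laden illustration: `is_listed` is a hypothetical helper of mine, and to keep the example self-contained it reads a copy of the output shown above instead of calling Docker.

```shell
# Sketch: succeed if a Container name appears in the NAMES column (the last
# field) of "docker ps" output read from standard input.
# is_listed is a hypothetical helper, not a Docker command.
is_listed() {
    awk -v name="$1" 'NR > 1 && $NF == name { found = 1 } END { exit !found }'
}

# A copy of the "docker ps" output shown above, used as sample input.
sample='CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
f0ff0c64dec2        mysql:5.7.22        "docker-entrypoint.s…"   3 minutes ago       Up 2 seconds        0.0.0.0:3306->3306/tcp   course-mysql'

if printf '%s\n' "$sample" | is_listed course-mysql; then
    echo "course-mysql is running"
else
    echo "course-mysql is not running"
fi
```

On a machine with Docker installed, the same helper could be used as `docker ps | is_listed course-mysql`.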


#  This is all we will learn for now.

But there is a lot more to learn about Docker!  