<DIV ALIGN=CENTER>

# Virtualization & Docker    
## Professor Robert J. Brunner
  
</DIV>  
-----

## Virtualization Overview

Virtualization technology is used to simplify the development, deployment, and management of applications.  

Virtualization can take different routes; two of the more popular options are:

1. Virtual Machines
2. Virtual Containers

-----


### Virtual Machines

- full operating system
- control complete environment
- isolation
- slow to boot
- heavyweight

-----

### VirtualBox

![Virtualbox Website](images/virtualbox.png)

-----

### Virtual Container

- shared operating system (Linux Containers)
- Mac OSX/Windows require a Linux host OS (VM)
- multiple containers from same image
- lightweight
- fast startup

-----

### Docker Container

![Docker Website](images/docker-website.png)

-----

### VMs versus Containers

The [Docker Website](https://www.docker.com/whatisdocker) provides a comparison between Virtual Machines and Docker containerization.

-----

### Boot2Docker

Mac OSX and Windows systems require a Linux guest operating system, running in a virtual machine, in order to run Docker containers. For these operating systems, you can use the Boot2Docker application, which provides:

- A VirtualBox installation
- A lightweight Linux guest OS (based on Tiny Core Linux)
- The Boot2Docker application

![Boot2Docker Application](images/boot2docker.png)

-----

### Boot2Docker Shell

Running the Boot2Docker application will open a new boot2docker shell as shown below. You can start multiple boot2docker shells by re-running the boot2docker application. In this course, we denote a boot2docker shell prompt with the dollar sign, `$`.

![B2D Shell](images/b2d-shell.png)

-----


### Docker Commands

You can see the list of docker commands, shown below, by entering `docker` at a boot2docker shell prompt:

    $ docker

![Docker Commands](images/docker-commands.png)

-----


### Docker Shell

To start a new container from an existing image, you simply enter a
docker run command with the target image name and the desired executable
program. To start an interactive docker container, you include
the `-it` flag. Thus, to start a new container in an interactive Unix
Bash shell, we enter the following command:

    $ docker run -it lcdm/standalone-rppdss /bin/bash
   
![Docker Shell](images/docker-shell.png)

Note how the prompt has changed to indicate that we are now running a
shell in the new docker container. The directory is
`/home/data_scientist`, and our prompt has changed to the string
`data_scientist@0399bef31440:~$`, but we will simply use `:~$` to refer
to a docker shell prompt (the `~` character simply means we are in the
home directory of the Docker container). The string of characters
`0399bef31440` is simply the hostname Docker assigned to our running
container; your value will likely be different.
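
The same `docker run` form also works without `-it` when the program does not need a terminal. As a minimal sketch, the helper below runs one program in a fresh container and passes the `--rm` flag so that Docker removes the container when the program exits. The function name `run_in_course_image` is our own and is not part of Docker.

```shell
# Hypothetical helper: run a single program in a throwaway container.
# --rm asks Docker to delete the container once the program exits.
run_in_course_image() {
    docker run --rm lcdm/standalone-rppdss "$@"
}

# Example use at a boot2docker prompt (not run automatically):
#   run_in_course_image ls /home/data_scientist
```

Leaving out `--rm` keeps the stopped container around, which is what you want if you plan to inspect or commit it later.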

-----

### Docker Container

After you run the following command in one boot2docker shell, a second boot2docker shell will show both the original container image and the new running container.

```
$ docker run -it lcdm/standalone-rppdss /bin/bash
```

To see the list of docker images, you enter `docker images` at a boot2docker prompt, while to see the list of running docker containers you enter `docker ps` at the boot2docker prompt.

![Docker Container](images/docker-container.png)
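
As a small sketch of these listing commands, the helper below prints the local images followed by the containers. The function name `show_docker_state` is hypothetical. The `-a` flag on `docker ps` also lists containers that have exited, which plain `docker ps` omits.

```shell
# Hypothetical helper: summarize what Docker currently knows about.
show_docker_state() {
    docker images   # images stored on this machine
    docker ps       # containers that are running right now
    docker ps -a    # all containers, including ones that have exited
}

# Example use at a boot2docker prompt (not run automatically):
#   show_docker_state
```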

-----



### Isolation

You can view the container isolation by running two boot2docker shells.
In the first shell, we run the docker container, and make a new
directory in the container's home directory that contains a single, new file.

```
$ docker run -it --name=standalone lcdm/standalone-rppdss /bin/bash
    
:~$ mkdir testing
:~$ cd testing/
:~$ touch README
:~$ ls 
```

![Docker First Container](images/docker-container1.png)

Now, in a second boot2docker shell, we run a new instance of the course
docker image under a different name (container names must be unique),
and see the effects of container isolation:

```
$ docker run -it --name=standalone2 lcdm/standalone-rppdss /bin/bash
:~$ ls 
```

Since our running containers are, by default, isolated, the changes made
in the first container are not present in the second container.

![Docker Second Container](images/docker-container2.png)

-----

## Persistence

You can save changes made to a running container by using the `docker commit` command. In the previous example, our running container added a new file called `README` in the `testing` directory. We can create a new image from this running container, and use this new image to start a container that includes any changes made in the original running container. 

To do this, we first need to get the list of running containers by using `docker ps` to find the name of the container we want to persist. Next we commit the changes to the running container by using `docker commit`, after which we can start a new container from this saved image, and verify the new container has the original changes.

![Docker Commits](images/docker-commit.png)
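
As a rough sketch, the steps in the screenshot can be wrapped in a small helper like the one below. The function name `persist_container` and the image name `standalone-testing` are hypothetical choices for illustration, and the container name `standalone` assumes the `--name` flag used in the Isolation example.

```shell
# Hypothetical helper sketching the commit workflow.
persist_container() {
    container="$1"   # name of the running container, as listed by `docker ps`
    new_image="$2"   # name to give the saved image

    docker ps                                 # confirm the container is running
    docker commit "$container" "$new_image"   # save its file system as a new image
    docker images                             # the new image should now be listed
}

# Example use at a boot2docker prompt (not run automatically):
#   persist_container standalone standalone-testing
#   docker run -it standalone-testing /bin/bash
#   :~$ ls testing
```

If the commit worked, the final `ls` in the new container should list the `README` file created earlier.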

-----

## Running the course Docker image

Once Docker has been properly set up, you can pull the course Docker
image from the Docker Hub registry by issuing a `docker pull` command:

```
$ docker pull lcdm/standalone-rppdss
```

In this command, we indicate that we wish to pull the
`standalone-rppdss` image from the LCDM repository on the Docker Hub
registry website. If you wish to see how this image was constructed, you
can look at the Dockerfile for this course, located in the
`docker/standalone` directory of this course's GitHub repository. Note
that I also cloned the course GitHub repository to my local computer:

```
$ mkdir github
$ cd github
$ git clone https://github.com/ProfessorBrunner/rp-pdss15
```

Once the image has been downloaded to your local computer, you can start
the container running so that it will function as a local JupyterHub
server by issuing the following Docker command:

```
$ docker run --net host -d -p 8888:8888 --name standalone -v \
/Users/rb/github/rp-pdss15:/home/data_scientist/rp-pdss15 \
lcdm/standalone-rppdss
```

This command creates a running Docker container from the
`lcdm/standalone-rppdss` Docker image, runs it in the background (`-d`),
instructs the container to listen for connections on web port 8888,
names the running container `standalone`, and maps a local directory
(in this case `/Users/rb/github/rp-pdss15`) to a shared directory in the
running container, located at `/home/data_scientist/rp-pdss15`. You
should change this directory to a suitable local directory on your
computer. The shared folder allows you to persist any changes made in
the running Docker container on your local filesystem.
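
Because the local path differs on every computer, one option is to wrap the command in a small helper that takes the directory as an argument. This is only a sketch: the function name `start_course_server` is hypothetical, and the flags are copied unchanged from the command above.

```shell
# Hypothetical helper: start the course container with a chosen shared folder.
start_course_server() {
    host_dir="$1"   # local directory to share with the container

    docker run --net host -d -p 8888:8888 --name standalone \
        -v "$host_dir":/home/data_scientist/rp-pdss15 \
        lcdm/standalone-rppdss
}

# Example use at a boot2docker prompt (not run automatically):
#   start_course_server /Users/rb/github/rp-pdss15
```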

You can connect into this running Docker container by issuing a `docker
exec` command:

```
$ docker exec -it standalone /bin/bash
```
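
A container started with `-d` keeps running after you leave the `docker exec` shell, and its name stays reserved until the container is removed. As a minimal sketch, assuming the container is named `standalone` as above, the hypothetical helper below stops and removes it so that the name can be used again.

```shell
# Hypothetical helper: shut down the course container and free its name.
stop_course_server() {
    docker stop standalone &&   # stop the running container
    docker rm standalone        # remove it so the name can be reused
}

# Example use at a boot2docker prompt (not run automatically):
#   stop_course_server
```

Files written to the shared folder remain on your local filesystem after the container is removed.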

-----

## Connecting to a local Jupyter Server

Once the course Docker image is running on your local computer, you can
connect to the Jupyter Server by opening a web browser to the [default
URL](http://192.168.59.103:8888). If successful, you should see the
directory listing (which might change, depending on what is in your
shared folder).

![Jupyter Server](images/jupyter-server.png)

At this point, you can navigate the directory structure of our course
GitHub repository to find and run the notebooks of interest.
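
If the default URL does not respond, the virtual machine may have been given a different address. As a sketch, assuming the `boot2docker` command-line tool is installed and that its `ip` subcommand prints the address of the virtual machine, the hypothetical helper below builds the URL to open in your browser.

```shell
# Hypothetical helper: print the URL of the local Jupyter Server.
# ASSUMPTION: `boot2docker ip` prints the VM address on standard output.
jupyter_url() {
    echo "http://$(boot2docker ip):8888"
}

# Example use at a boot2docker prompt (not run automatically):
#   jupyter_url
```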

-----

### Additional References

1. The [Docker Website](https://www.docker.com)
2. The [VirtualBox Website](https://www.virtualbox.org)
3. (Advanced) The [Docker Book](http://dockerbook.com)
4. (Advanced, Mac OSX) The [Docker: Missing Manual](http://viget.com/extend/how-to-use-docker-on-os-x-the-missing-guide)

-----

### Return to the [Index](../index.ipynb) page.

-----