## Lighthouse Labs - Synaptive Medical

### W5D5 Containerization

Instructor: Socorro Dominguez  
December 18, 2020

**Agenda:**
- What are containers? (10 min)
- Why do we need them? (5 min)
- Docker - how it works (10 min)
- Docker for Data Scientists (5 min)

## Documenting and loading dependencies

You've made a beautiful data analysis pipeline/project using Python. It runs on your machine, but how easily can you, or someone else, get it working on theirs? The answer usually is: it depends...

What does it depend on?

- Do your README and your scripts make it obvious which programs and packages are needed to run your data analysis pipeline/project?

- Do you also document the version numbers of the programs and packages you used? This can have big consequences when it comes to replicability...

## How do I find out what packages and versions I am using?

### In Python 

From the command line, `pip freeze` lists all installed packages and their versions.

To find the version of Python, from inside Python type:
```
import sys

# Full version string of the running interpreter (version number plus build details)
print(sys.version)
```
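
The same information can also be collected from inside Python. The sketch below assumes Python 3.8 or newer, where the standard library module `importlib.metadata` is available; the helper names `installed_packages` and `freeze_lines` are made up for this example. The output only resembles `pip freeze` and is not guaranteed to match it line for line.

```python
import sys
from importlib.metadata import distributions


def installed_packages():
    """Return {name: version} for every distribution this interpreter can see."""
    packages = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that have no name in their metadata
            packages[name] = dist.version
    return packages


def freeze_lines(packages):
    """Format a {name: version} mapping as sorted `name==version` lines."""
    return sorted(
        (f"{name}=={version}" for name, version in packages.items()),
        key=str.lower,
    )


# Short Python version such as 3.8.5, taken from sys.version_info
python_version = ".".join(str(part) for part in sys.version_info[:3])
print(f"Python {python_version}")

# Show only the first few packages to keep the output short
for line in freeze_lines(installed_packages())[:5]:
    print(line)
```

Saving these lines next to your analysis code is one simple way to record the versions you used.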

## Tools for managing software and package dependencies

### Language agnostic (polyglot: can be used with more than just Python):

- Docker
- conda

## What is Docker?

*"Docker containers wrap a piece of software in a complete filesystem that contains everything needed to run: code, runtime, system tools, system libraries – anything that can be installed on a server. This guarantees that the software will always run the same, regardless of its environment."*

*Source: https://www.docker.com/what-docker*

## Container versus Virtual Machines

*Key takeaway - Docker containers share the host's operating system kernel, whereas each virtual machine runs a completely separate, additional operating system. This can make Docker containers lighter (smaller in size) and faster to start than virtual machines.*

![alt tag](img/container_v_vm.png)
*Source: https://www.docker.com/resources/what-container*


## Motivating example - Hello Docker!

We can use the [hello-world](https://github.com/docker-library/docs/tree/master/hello-world) image to make sure you have Docker installed properly.

**Problem** - [hello-world](https://github.com/docker-library/docs/tree/master/hello-world) is pretty easy to run. But some other packages, such as Keras, are hard to install (especially on Windows...)

**Solution** - let's use a Docker container. We will do a walkthrough on this.


### Let's do it!

### Step 0 - Install Docker (do this once)
- Follow the [instructions in the MDS installation guide](https://ubc-mds.github.io/resources_pages/installation_instructions/) to do this.

### Step 1 - launch the Docker app (for macOS & Windows only)
- Use launchpad/Finder/Start menu/etc to find and launch Docker

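### Step 2 - get the hello-world image from Docker Hub
- open the command line (terminal/Git Bash)
- type: `docker run hello-world` (if the image is not on your machine yet, Docker pulls it from Docker Hub before running it)
- verify that the image was pulled successfully by typing `docker images`; you should see something like: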
```
REPOSITORY                TAG                 IMAGE ID            CREATED             SIZE
hello-world              latest              bf756fb1ae65         11 months ago       13.3kB
```
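
If you want to check for an image from a script rather than by eye, the table above can be parsed. This is a minimal sketch that assumes the default column layout shown above, with columns separated by at least two spaces; the function names `parse_images_table` and `has_image` are made up for this example. Docker also has a `--format` option for `docker images`, which is a more robust choice for real scripts.

```python
import re

# The sample table from above, pasted in as text
SAMPLE_OUTPUT = """\
REPOSITORY                TAG                 IMAGE ID            CREATED             SIZE
hello-world              latest              bf756fb1ae65         11 months ago       13.3kB
"""


def parse_images_table(text):
    """Turn the column-aligned text printed by `docker images` into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    # Columns are separated by runs of two or more spaces; single spaces
    # occur inside values such as "IMAGE ID" and "11 months ago".
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[1:]:
        values = re.split(r"\s{2,}", line.strip())
        rows.append(dict(zip(header, values)))
    return rows


def has_image(rows, repository, tag="latest"):
    """Return True if a parsed row matches the given repository and tag."""
    return any(
        row.get("REPOSITORY") == repository and row.get("TAG") == tag
        for row in rows
    )


images = parse_images_table(SAMPLE_OUTPUT)
print(has_image(images, "hello-world"))  # True for the sample table above
```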

## Wait! Where did this come from? [Docker Hub](https://hub.docker.com/)!


- [Docker Hub](https://hub.docker.com/) is like GitHub but just for Docker images. 
- So what you downloaded was a Docker image that lives in this repository: https://hub.docker.com/_/hello-world


### Step 3 - launch a container from the image and poke around!

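- The `hello-world` image only prints a message and exits, and it contains no shell to explore. For this step, use an image that ships a shell, for example the official `ubuntu` image. Type: `docker run -it ubuntu`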
- If it worked, then your command line prompt should now look something like this:

```
root@ad0560c5b81a:/# 
```
- use `ls`, `cd`, `pwd` and explore the container
- type `exit` to leave when you are done (your prompt will look normal again)!

### Step 4 - clean up your container!

- After you close a container it still "hangs" around... 
- View any existing containers using `docker ps -a`
- Remove the stopped container by typing `docker rm <container_id>`
- Prove to yourself that the container is no longer "hanging around" via `docker ps -a`, but that you still have the image installed (via `docker images`)
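
The clean-up steps above can be sketched in Python for the case where several stopped containers have piled up. This is only an illustration: the function names are made up, and the `run` parameter exists so the logic can be tried without Docker by passing in a stand-in for `subprocess.run`. The `-q` flag (print only container IDs) and `--filter status=exited` are options of `docker ps`.

```python
import subprocess


def exited_container_ids(run=subprocess.run):
    """Return the IDs of all containers that have exited, using `docker ps`."""
    result = run(
        ["docker", "ps", "-a", "-q", "--filter", "status=exited"],
        capture_output=True,
        text=True,
        check=True,
    )
    # One container ID per line; split() also copes with an empty result
    return result.stdout.split()


def remove_containers(container_ids, run=subprocess.run):
    """Remove the given containers with `docker rm` and return the command used."""
    if not container_ids:
        return []  # nothing to remove, so do not call Docker at all
    cmd = ["docker", "rm", *container_ids]
    run(cmd, check=True)
    return cmd


# Typical use, with Docker running:
#   remove_containers(exited_container_ids())
```

Images are not touched by this, so `docker images` should list the same images before and after.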

### That's a lot of work...

- We can tell Docker to delete the container upon exit using the `--rm` flag in the run command.
- Type the command below to run a container again. The `hello-world` container exits on its own once it has printed its message; afterwards, prove to yourself that the container was deleted (but not the image!):

```
docker run -it --rm hello-world
```

### Image vs container?

Analogy: The program Chrome is like a Docker image, whereas a Chrome window is like a Docker container.

<img src="img/instance_analogy.png" width="600" align="left"/>


### Step 5 - connect the container to your hard drive

To run a containerized program on a file that exists on our laptop, we need to mount at least part of our filesystem as a volume in the Docker container. We do that with `-v <absolute path on laptop>:<absolute path in container>`. Docker expects the container side of the mapping to be an absolute path.


_**notes:**_
- _**Windows machines might need to explicitly share drives with Docker, see [here](http://peterjohnlightfoot.com/docker-for-windows-on-hyper-v-fix-the-host-volume-sharing-issue/) for how to do so.**_
- _**Windows machines might have to use Windows filepaths...**_
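
Typing long absolute paths by hand is error-prone, so here is a minimal sketch that builds the value for `-v` from a folder on your laptop, assuming both sides of the mapping are given as absolute paths. The function name `volume_argument` is made up for this example, and the Windows notes above still apply: the path this produces on Windows may need adjusting for your shell.

```python
from pathlib import Path, PurePosixPath


def volume_argument(host_dir, container_dir):
    """Build the `<host path>:<container path>` value passed to `docker run -v`."""
    # Expand "~" and turn relative paths such as "." into absolute ones
    host_path = Path(host_dir).expanduser().resolve()
    # Paths inside a Linux container use forward slashes and start at "/"
    container_path = PurePosixPath(container_dir)
    if not container_path.is_absolute():
        raise ValueError(f"container path must be absolute: {container_dir!r}")
    return f"{host_path}:{container_path}"


# Mount the current working directory at /home/jovyan/work in the container
print(volume_argument(".", "/home/jovyan/work"))
```

The printed value can be pasted after `-v` in a `docker run` command.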

## Debrief - what did we just do? 

Let's now dig deeper into the commands and tools we just used to become more familiar with Docker! Let's fill in this table to explain what each part of our Docker commands did:

| command/flag | What it does |
|--------------|--------------|
| `pull`       | Downloads a Docker image from Docker Hub |
| `images`     | Lists the images installed on your machine |
| `run`        | Launches a Docker container from an image |
| `-it`        | Tells Docker to run the container interactively |
| `--rm`       | Makes a container ephemeral (deletes it upon exit) |
| `-v`         | Mounts a directory from your laptop into the Docker container as a volume |
| `exit`       | Exits the shell inside a Docker container, which stops the container |
| `-p`         | Maps a port in the container to a port on your machine, so that you can reach a program running in the container through a web browser |
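
To see how these flags combine into one command, here is a small Python sketch that assembles a `docker run` command from its parts. The function `build_run_command` and its parameters are made up for this illustration and are not part of Docker; it only builds the command and does not run anything.

```python
import shlex


def build_run_command(image, interactive=False, remove=False, volumes=None, ports=None):
    """Assemble a `docker run` command as a list of arguments."""
    cmd = ["docker", "run"]
    if interactive:
        cmd.append("-it")  # run the container interactively
    if remove:
        cmd.append("--rm")  # delete the container when it exits
    for host_port, container_port in ports or []:
        cmd += ["-p", f"{host_port}:{container_port}"]  # host port : container port
    for host_dir, container_dir in volumes or []:
        cmd += ["-v", f"{host_dir}:{container_dir}"]  # laptop path : container path
    cmd.append(image)  # the image name goes after all the flags
    return cmd


# A removable, interactive hello-world container
print(shlex.join(build_run_command("hello-world", interactive=True, remove=True)))

# A Jupyter container with one port and one folder shared;
# "<your path>" is a placeholder to replace with a folder on your laptop
jupyter_cmd = build_run_command(
    "jupyter/tensorflow-notebook",
    remove=True,
    ports=[(8888, 8888)],
    volumes=[("<your path>", "/home/jovyan/work/")],
)
print(shlex.join(jupyter_cmd))
```

Keeping the command as a list of arguments means it could later be handed to `subprocess.run` without worrying about quoting.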

You now have 15 minutes to make sure that Docker is installed and running.  
Try running `docker run hello-world`.

Once your install is complete, read the following [article](https://www.freecodecamp.org/news/a-beginner-friendly-introduction-to-containers-vms-and-docker-79a9e3e119b/).  
We will do a walkthrough afterwards.

Instructions for the Keras/TensorFlow walkthrough:
1. Make sure all other Jupyter Notebooks/Labs are closed.
2. Open a terminal and, the first time only, run `docker pull jupyter/tensorflow-notebook`.
3. Run the following command: `docker run --rm -p 8888:8888 -v <your path>:/home/jovyan/work/ jupyter/tensorflow-notebook`. For example: `docker run --rm -p 8888:8888 -v /Users/seiryu8808/Desktop/Lighthouse/synaptive_lighthouse_workshop/W5D5/:/home/jovyan/work/ jupyter/tensorflow-notebook`
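
Step 1 is presumably there because a Jupyter server running on your laptop normally listens on port 8888, the same port the `-p 8888:8888` flag asks for. That reason is an assumption, not something stated in the steps. Under that assumption, this sketch checks whether anything is already listening on the port before you launch the container; the function name `port_is_free` is made up for this example.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts connections on the given local port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1)  # do not hang if the connection attempt stalls
        # connect_ex returns 0 when the connection succeeds, which means
        # some program is already listening on that port
        return sock.connect_ex((host, port)) != 0


if port_is_free(8888):
    print("Port 8888 looks free.")
else:
    print("Something is already listening on port 8888; close it first.")
```

If the port is busy and you cannot close the other program, mapping a different host port (for example `-p 8889:8888`) is another option.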