# Docker Basics

## Agenda

* What is containerization?
* Why do we need containerization?
* What is Docker?
* Installing Docker (Linux, Mac only)
* Dockerfile Jupyter example - creating and running

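## What is containerization?

* A container is a program packaged together with all of its dependencies, already installed (compare a Windows installer, where you click Next through every step).
* Containers are like virtual machines, but lighter, because they share the host's kernel instead of booting a full guest OS.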

![cont-1](https://www.panda3d.org/manual/images/9/98/Install-1.jpg)

## Why would you need containers?

* Say you have an analysis in a Jupyter notebook that you want to share with your friends, so they can tinker with it and run it themselves.

![need-1](https://sigmoidal.io/wp-content/uploads/2017/03/jupyter_notebook-spectral1.png)



## Issues

* First, they have to set up and install the correct libraries (in specific versions)
* Everyone might be on a different OS, so the instructions differ for each (Linux, Windows, Mac)
* Each person has to do all of this on their own

## Containers Solve These Issues

* Software dependency resolution (OS, libraries)
* Almost no setup time: pull the image and run it

## What is Docker?

* Docker is both a company and the containerization tool it provides; the name has become almost a synonym for containerization

![docker-1](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSPcxNTEBP0Dd-5Gv6wWA6TYalicfejK1c7Y-07FiYOx-G8-SyQNQ)

* Other container technologies exist:
    * rkt (aka rocket)
    * Mesos Containers
    * Windows Server Containers

## Installing Docker

### Ubuntu 

In [None]:
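# refresh the package index
sudo apt-get update
# install Docker
sudo apt-get install -y docker.io
# start the Docker service now
sudo systemctl start docker
# start Docker automatically when the system boots
sudo systemctl enable docker
# check the installed version
docker --version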
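
A quick way to confirm the install worked is to check that the `docker` command is on the PATH. The sketch below does this with the shell's built-in `command -v`; `check_tool` is a hypothetical helper name used only for illustration, not part of Docker.

```shell
# check_tool NAME: report whether a command-line tool is on the PATH.
# (check_tool is a hypothetical helper, not a Docker command.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

check_tool docker
```

If it reports `docker: found`, running `sudo docker run hello-world` is a common follow-up, since that also exercises the Docker service.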

### Mac

Go to this link and download the stable version:

In [None]:
https://docs.docker.com/v17.12/docker-for-mac/install/#download-docker-for-mac

## Demo: Create Containers

In [None]:
docker pull ubuntu

![doc1](./images/pic1.png)

In [None]:
docker images

![pic4](./images/pic4.png)

In [None]:
docker run -it ubuntu bash

![pic2](./images/pic2.png)

In [None]:
docker ps

![pic3](./images/pic3.png)

Try pulling other Docker images:

* docker pull python
* docker pull postgres
* docker pull 

## Dockerfiles

A Dockerfile is a text file of instructions from which Docker builds an image; we then run that image as a container.

![pic5](./images/pic5.png)

## Let's Create Our Own Jupyter Dockerfile

Create a file called **Dockerfile** and paste in the following lines:

In [None]:
FROM ubuntu

RUN apt-get update
RUN apt-get upgrade -y

RUN apt-get install -y python3 python3-dev python3-pip

Build the image using:

In [None]:
docker build . -t my_python_image

Check that the image was built using:

In [None]:
docker images

## Now we can install popular Python packages

Add a `pip3 install` line to the Dockerfile, then rebuild the image using the same command as before:

In [None]:
FROM ubuntu

RUN apt-get update
RUN apt-get upgrade -y

RUN apt-get install -y python3 python3-dev python3-pip

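# Note: on newer Ubuntu base images pip3 may refuse system-wide installs
# (externally-managed-environment); pinning an older base tag such as ubuntu:22.04 avoids this.
RUN pip3 install jupyter pandas numpy scipy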

When you rebuild, you will notice that only the new line is run. Docker reuses the cached result of every instruction that has not changed, and reruns everything from the first changed instruction onward.

![pic6](./images/pic6.png)
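
The caching rule can be illustrated with a toy model. The sketch below is not how Docker is implemented and `cache_report` is a hypothetical helper, not a Docker command: it simply compares two Dockerfiles line by line and labels each instruction in the new file as `CACHED` until the first difference, and `REBUILD` from there on. It assumes the two files line up one line at a time, which is a simplification.

```shell
# cache_report OLD NEW: toy model of the build cache (hypothetical helper).
# Instructions in NEW are CACHED until the first line that differs from
# OLD; that line and everything after it are marked REBUILD.
cache_report() {
    status="CACHED"
    # Read NEW on stdin and OLD on file descriptor 3, one line each per step.
    while IFS= read -r new_line || [ -n "$new_line" ]; do
        IFS= read -r old_line <&3 || true
        # Blank lines and comments are not instructions, so skip them.
        case "$new_line" in
            ""|"#"*) continue ;;
        esac
        # Once one instruction differs, every later one is rebuilt too.
        [ "$new_line" = "$old_line" ] || status="REBUILD"
        echo "$status $new_line"
    done < "$2" 3< "$1"
}

# Demo with two small throwaway files.
tmp_dir="$(mktemp -d)"
printf 'FROM ubuntu\nRUN apt-get update\n' > "$tmp_dir/old"
printf 'FROM ubuntu\nRUN apt-get update\nRUN pip3 install jupyter\n' > "$tmp_dir/new"
cache_report "$tmp_dir/old" "$tmp_dir/new"
rm -r "$tmp_dir"
```

In the demo only the appended `RUN pip3 install jupyter` line is marked `REBUILD`, which mirrors what the build output shows. It also suggests why rarely changing instructions are best placed near the top of a Dockerfile.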

## Let's run the Docker container with Jupyter notebook

Add an `ENTRYPOINT` instruction and rebuild the image:

In [None]:
FROM ubuntu

RUN apt-get update
RUN apt-get upgrade -y

RUN apt-get install -y python3 python3-dev python3-pip

RUN pip3 install jupyter numpy

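# --ip=0.0.0.0 listens on all interfaces so the host can reach the server;
# --allow-root is needed because the container runs as root
ENTRYPOINT ["jupyter", "notebook", "--ip=0.0.0.0", "--allow-root", "--port=8889"]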

## Run the container

The `-p` flag maps a port on the host to a port in the container. Here we map port 8889 on the host to port 8889 in the container, where Jupyter is listening:

In [None]:
docker run -p HOST_PORT:CONTAINER_PORT image-name

In [None]:
docker run -p 8889:8889 my_python_image

![pic8](./images/pic8.png)
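
The two sides of `-p` do not have to be the same number. As a small sketch of the `HOST_PORT:CONTAINER_PORT` order, the hypothetical helper `port_flag` below (not part of Docker) only builds the flag and prints the resulting command, so it is safe to run without starting anything. The host port 9000 is an arbitrary example.

```shell
# port_flag HOST_PORT CONTAINER_PORT: print the matching -p argument.
# (port_flag is a hypothetical helper used only for illustration.)
port_flag() {
    host_port="$1"
    container_port="$2"
    echo "-p ${host_port}:${container_port}"
}

# Jupyter listens on 8889 inside the container; expose it on 9000 locally.
echo "docker run $(port_flag 9000 8889) my_python_image"
```

With that mapping the notebook would be reached on port 9000 of the host, while the `ENTRYPOINT` inside the image stays unchanged.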

## Persisting Data

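Files created inside a container are lost when the container is removed. To keep our work, we attach a folder on our computer to a folder inside the container with the `-v` flag. Create a directory called **my-data** on your computer; we will store all our Jupyter notebook analysis there. Inside Jupyter, save your notebooks under `home`, since that is the folder the directory is mounted on.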

In [None]:
docker run -v /HOST/DIRECTORY/FULL/PATH:/container/directory image-name

In [None]:
docker run -p 8889:8889 -v "$(pwd)/my-data/":/home/ my_python_image

![pic7](./images/pic7.png)
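
The host side of `-v` should be a full path, which is why the command above uses `$(pwd)`. A minimal sketch of that step is below, assuming a hypothetical helper named `volume_flag` (not part of Docker): it creates the host folder if it is missing, resolves it to an absolute path, and prints the resulting `-v` argument. It only prints the command and does not quote paths containing spaces.

```shell
# volume_flag HOST_DIR CONTAINER_DIR: print the matching -v argument.
# (volume_flag is a hypothetical helper used only for illustration.)
volume_flag() {
    host_dir="$1"
    container_dir="$2"
    mkdir -p "$host_dir"                  # create the host folder if missing
    abs_dir="$(cd "$host_dir" && pwd)"    # resolve it to an absolute path
    echo "-v ${abs_dir}:${container_dir}"
}

echo "docker run -p 8889:8889 $(volume_flag my-data /home) my_python_image"
```

Run from the folder where you want **my-data** to live, this prints a command equivalent to the one above.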