# Intro to Docker

<p align=center><a href=https://www.docker.com><img src=images/Docker_Logo.png width=400></a></p>

## Introduction


You have probably encountered an issue where the main problem is the compatibility between the application you are running and your operating system or the installed packages. Wouldn't it be nice if all applications had a common ground that let them run on any operating system?

The solution is [Docker](https://www.docker.com), a tool that <i>container</i>ises your application so that it can run in any environment. Containerising is the process of packaging your application and all its dependencies into containers, each of which is a single self-contained environment.

This idea built on LXC (Linux Containers) and the way they were used, and it __revolutionized the way we deploy software__.

__What can I use Docker for?__
- Standardized development environment across many workstations
- Consistent deployment of applications
- Full reproducibility (e.g. for research)

### Intro to Containers and Images

But wait... What is a container? You can think of a container as something similar to a Virtual Machine (VM). However, whilst a VM virtualises the hardware (for example, how much RAM and storage you have), a container only virtualises the Operating System (OS). Moreover, Docker doesn't make a full copy of the OS you want to work with; it only provides the tools necessary to make that specific OS environment run.

So, let's say for example that your application runs on the latest Ubuntu version. Docker will NOT install the latest version each time you need to run your application. It will simply get the tools necessary to run that version, without installing a whole OS.

But you might be wondering: don't even those tools need some time to be installed? That's correct, but the good thing about Docker is that it installs them only once (unless you uninstall them), and the next time you want to run the same application, Docker will know where these tools have been installed. To do so, Docker relies on a powerful concept named Docker images, which are the templates used to run containers.

You can think of Docker images as Python classes, and containers as instances of those classes. The image is the blueprint that indicates all the steps needed to run the container in which your application is held. The first time you `build` the image, the tools you need to run your application will be downloaded and installed, and Docker will know how to access those tools when you run your application again.
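To make the analogy concrete, here is a minimal Python sketch. It is only an illustration of the class/instance idea, not of how Docker works internally; the class name, the package names and the `scratch_files` attribute are made up for this example.

```python
class CelebrityScraperImage:
    """Plays the role of a Docker image: a fixed template."""

    # Class-level data is shared by every instance, like the read-only
    # contents of an image shared by every container started from it.
    base = "python:3.8-slim-buster"
    packages = ("beautifulsoup4", "requests")  # illustrative names only

    def __init__(self, name):
        # Instance-level state belongs to one "container" only.
        self.name = name
        self.scratch_files = []


# Two "containers" created from the same "image".
first = CelebrityScraperImage("first-container")
second = CelebrityScraperImage("second-container")

# Writing inside one "container" does not affect the other...
first.scratch_files.append("results.json")
print(second.scratch_files)  # → []

# ...while both still share the same template.
print(first.base == second.base)  # → True
```

Just as you can create many instances from one class, you can start many containers from one image, and each of them keeps its own changes.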

One nifty trick Docker does to be more efficient is that those tools can be shared between images, so if you create another image where those tools are used again, Docker won't download or install them, because it knows how to find them.
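The sharing described above can be pictured with a toy model. This is a deliberately simplified sketch, not Docker's actual caching algorithm: it assumes that a build step can be reused whenever the exact same sequence of steps up to that point has been built before. The `build` function and the `flask` step are hypothetical and exist only for this illustration.

```python
def build(steps, layer_cache):
    """Return the steps that really had to be executed for this build."""
    executed = []
    for i in range(len(steps)):
        # A step is identified by itself plus everything that came before it.
        key = tuple(steps[: i + 1])
        if key not in layer_cache:
            layer_cache.add(key)       # pretend we downloaded/installed it
            executed.append(steps[i])
    return executed


layer_cache = set()
scraper_steps = ["FROM python:3.8-slim-buster", "RUN pip install requests"]
api_steps = ["FROM python:3.8-slim-buster", "RUN pip install flask"]

first_build = build(scraper_steps, layer_cache)
second_build = build(api_steps, layer_cache)

print(first_build)   # → ['FROM python:3.8-slim-buster', 'RUN pip install requests']
print(second_build)  # → ['RUN pip install flask']
```

The second build skips the Python base step because the first build already produced it, which is the effect you will notice when building several images from the same base.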

> <font size=+1>Docker allows us to package our code, apps etc. with all the necessary dependencies in a self-contained environment called a `Docker image`. We can then create instances of these images, which we call `Docker containers` </font>

### Why are containers so important?

Now that you know what a container is, let's summarise why containers are so important (and powerful).

Let's see how containers changed the deployment landscape:

<p align=center><img src=images/container_evolution.svg width=800></p>

When deploying a regular application the traditional way, the deployment is lightweight because we only ship the code needed to run the application, but we (usually) don't take into account the operating system that application will run on. If we instead deployed a whole virtual machine, we would need to install and launch an entire operating system every time we wanted to run the application. Containers take the best of both worlds: they combine the light weight of a traditional deployment with the ability to run the application consistently across multiple environments.

__Benefits__:
- Containers are more lightweight than VMs
- They are built from __immutable__ images, meaning that the image content won't change once we create it
- You can easily create or destroy a container whenever you need it.
- All your application's necessities can be packed within the container
- Containers are reproducible since they run the same everywhere, regardless of the host OS.
- __Micro-services__ - an app can be broken into multiple separate containers communicating with each other:
    - We only change the image we need
    - No need to deploy everything anew
    - Easily switchable components

## Setting up Docker


Important! Before starting the rest of the notebook, we recommend using VSCode on your local machine, since the files you will run need the Docker engine to containerise your applications locally.

When you build and run Docker images, your computer creates containers that, even though they are not VMs, still need some of its resources allocated to them (think of a memory slot; it's more complicated than `memory slots`, but for simplicity, let's leave it like this) in order to run tools that your OS might not support natively.

To grant access to those memory slots, your computer needs to start an engine that creates the containers. Each OS does this differently. On Windows and macOS, you need to install Docker Desktop, which takes care of setting up the engine that creates the containers. Linux distributions, on the other hand, are more flexible: you can run Docker just by installing the engine itself.

Here we are going to see how to install Docker for each OS. 

<details>
    <summary> <font size=+2> For Linux Users </font> </summary>

To use Docker on Linux, you need to install Docker Engine, which is the core technology of Docker. To install it, go to the following webpage and follow the [instructions](https://docs.docker.com/engine/install/centos/). There, you will find how to install Docker based on your distribution.

If you are using Ubuntu, you can simply go to this [website](https://docs.docker.com/engine/install/ubuntu/). Make sure you follow the instructions that install Docker Engine `Using the repository`.

<p align=center> <img src=images/Docker_Engine.png width=400> </p>

</details>

<details>
    <summary> <font size=+2>  For Mac Users and any Windows Users that doesn't have Home Edition </font> </summary>

To use Docker on Mac, you need to install Docker Desktop. For both Mac and Windows, you can download it from this [website](https://docs.docker.com/desktop/).

<p align=center> <img src=images/Docker_install.png width=400> </p>

After that, select the operating system you are using. Remember that if you are using Windows Home Edition, you need to refer to the next section (in this notebook) to complete the installation.

</details>

<details>
    <summary> <font size=+2>  For Windows Home Edition Users </font> </summary>

Unfortunately, Windows Home edition is the worst for installing anything that requires access to your kernel. Luckily, there is a (complicated) workaround. You need to install the Hyper-V-Enabler, which grants Docker permission to access your kernel. To download it, click this [link](https://aicore-files.s3.amazonaws.com/Foundations/DevOps/Hyper-V-Enabler.bat).

Then, in the Windows searchbar, look for 'Turn Windows Features On or Off', and enable the following:
- Hyper-V Management Tools
- Hyper-V Platform
- Windows Hypervisor Platform
- Windows Subsystem for Linux

<p align=center> <img src=images/Docker_Home_Edition.png width=400> </p>

You might also need the latest version of WSL, which you can install using the following [file](https://aicore-files.s3.amazonaws.com/Foundations/DevOps/wsl_update_x64.msi).

After that, follow the instructions included in the `For Mac Users and any Windows Users that doesn't have Home Edition` section.
</details>

For all versions, once you have installed Docker, make sure the installation went fine by running `docker --version` in your CLI.
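If you prefer to run the same check from Python (for example, from a notebook cell), the following sketch may help. The helper name `docker_version` is made up for this example; it only relies on the standard library and on the `docker --version` command mentioned above.

```python
import shutil
import subprocess


def docker_version():
    """Return the output of `docker --version`, or None if unavailable."""
    if shutil.which("docker") is None:
        return None  # the docker executable is not on your PATH
    try:
        result = subprocess.run(
            ["docker", "--version"], capture_output=True, text=True
        )
    except OSError:
        return None  # the executable was found but could not be started
    if result.returncode != 0:
        return None
    return result.stdout.strip()


print(docker_version() or "Docker does not seem to be installed")
```

If the function returns `None`, go back to the installation steps for your operating system before continuing.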

### Docker Hub

So, we have been talking about containers and how they set up a common ground for all applications. Many other users make the images used to create those containers available to the rest of the world, so every time you need to run an application, you don't have to start from scratch.

> <font size=+1> Docker Hub is a repository that stores Docker images from users all across the world </font>

You can think of Docker Hub as what GitHub is for repositories, or what PyPI is for Python packages.

We will look at Docker Hub in more depth later, but for now, let's create an account, which we will eventually need to upload our images.

First, go to the [Docker Hub website](https://hub.docker.com) and create an account.

<p align=center> <img src=images/Docker_Hub.png width=600> </p>

We will go back to Docker Hub later in this notebook, and you will need to sign in through your terminal, so make sure you remember the password you used!

For now, one thing to bear in mind: you are going to use base images to create your own Docker images, and those base images are stored in Docker Hub.

## Docker Images and Dockerfile


> <font size=+1>A Docker image is the set of instructions needed to create an instance of a Docker container</font>

Thus, Docker images are essentially a set of steps that the Docker engine will take to create the environment we need to run our application. Those steps are declared in a file named `Dockerfile`, which is a special type of file that Docker will look for whenever you want to build an image. 

A `Dockerfile` has no extension: the name of the file is literally `Dockerfile`. You can, however, use `Dockerfile` as the extension. For example, if the Dockerfile specifies the steps to create an image for an API, you can call it `api.Dockerfile`.

In VSCode, when you create a Dockerfile, it will automatically recognize it as a Dockerfile, and you will notice thanks to the characteristic whale icon.

<p align=center> <img src=images/Docker_icon.png width=200> </p>

Once you create the Dockerfile, you can start containerising your application, but of course, you need to specify the commands you want Docker to run. 

Thus, let's take a look at what you can do inside a Dockerfile by looking at an example. 

Dockerfiles contain instructions such as `FROM`, `RUN`, `CMD` and `COPY`. The capitalised words starting each line in the Dockerfile are called __instructions__. They are basically commands followed by arguments (like in a terminal), which the `docker build` command knows how to execute. `docker build` runs each of these instructions in turn to create the Docker image.
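The "instruction followed by arguments" structure can be sketched in a few lines of Python. This is a rough illustration and not the parser Docker uses: it ignores features such as line continuations, and the function name `parse_instruction` is made up for this example. The sample lines are ones you will write later in this notebook.

```python
def parse_instruction(line):
    """Split one Dockerfile line into (INSTRUCTION, arguments)."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # blank lines and comments carry no instruction
    # Everything up to the first space is the instruction name.
    instruction, _, arguments = stripped.partition(" ")
    return instruction.upper(), arguments.strip()


dockerfile = """\
# Example Dockerfile
FROM python:3.8-slim-buster
COPY . .
RUN pip install -r requirements.txt
"""

for line in dockerfile.splitlines():
    parsed = parse_instruction(line)
    if parsed is not None:
        print(parsed)
```

Each printed pair shows the capitalised instruction first and the arguments it receives second.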

### Your first Docker image

In this example you will create a Docker image that runs the celebrity_births scraper you can find in the `Software Design and Testing` module. In case you haven't completed that part yet (or you forgot where you put the file), you can download the files [here](https://aicore-files.s3.amazonaws.com/Foundations/DevOps/celebrity_example.zip)

After you download the file, `cd` to that folder and create the Dockerfile. Call the file `Dockerfile` and let's dive into it.

Inside the Dockerfile start writing this:

```Dockerfile
FROM python:3.8-slim-buster
```

Usually, Docker images are built from a pre-built image that Docker can find on Docker Hub. The pre-built image usually contains some dependencies. For example, a common choice is an image with Python installed. You can tell Docker to download the pre-built image and build on top of it using the `FROM` instruction, as we see above.

So, with the first command we added, we will start creating our image with the necessary Python dependencies.

The next thing we need is to install or copy what we want inside the Docker image. Remember that the source path you give is relative to the directory where the Dockerfile is (more precisely, to the directory you pass to `docker build`).

In your Dockerfile, add the following line:

```Dockerfile
COPY . .
```

This will copy everything in the Dockerfile directory (including `requirements.txt` and the `scraper` folder) into the image, so it will be available inside the container.

It's very important to understand this step. When you build the image, your files are copied into it, and any container you run from it will see them as if they were on another computer. Think of the container as a separate computer to which you copy the files. So at this point, it is like having another mini computer with Python installed and your scraper in it.

One more thing you need before running the scraper is to install your Python packages, like beautifulsoup and requests. Luckily, we also copied the requirements file into the image, so we can install from it directly.

```Dockerfile
RUN pip install -r requirements.txt
```

We are almost there! The only thing left to do is run the Python script. We can't use the `RUN` instruction here, because `RUN` is executed when the image is built. We need an instruction that is executed when we run a container from the image, and that instruction is `CMD`.

```Dockerfile
CMD ["python", "scraper/celebrity_scraper.py"]
```

This instruction can be written in several ways. In this case, we are using square brackets: the first item is the executable (`python`) and the remaining items are its parameters (here, the file to run).
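The square-bracket form is written as a JSON array, which is why the items must use double quotes. As a small sketch of how such a line splits into an executable and its parameters (the variable names are chosen for this example only):

```python
import json

cmd_line = 'CMD ["python", "scraper/celebrity_scraper.py"]'

# Drop the instruction name and keep only the bracketed arguments.
_, _, raw_arguments = cmd_line.partition(" ")

# The bracketed part is valid JSON, so it can be read as a list of strings.
argv = json.loads(raw_arguments)
executable, parameters = argv[0], argv[1:]

print(executable)  # → python
print(parameters)  # → ['scraper/celebrity_scraper.py']
```

In other words, the container will start `python` and pass it the path to the scraper as its only argument.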

Now we are ready to build the image! In your command line interface, if you are not in the `celebrity_example` directory, move into it. Then, we need to use the `build` command from Docker. It has the following syntax:

`docker build [OPTIONS] PATH`

You can check the options [here](https://docs.docker.com/engine/reference/commandline/build/). One of the common options you may want to use is the `-t` flag, which gives a name (and optionally a `tag`) to our image.

`PATH` is the directory Docker builds from, and by default Docker looks for the Dockerfile in that directory. Since we are in the same directory as the Dockerfile, the path is simply a dot (`.`).

Run the following command in your command line

`docker build -t celebrities:latest .`

The `latest` after the name of our image is the tag (or the version if you prefer)
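When you leave the tag out, Docker assumes `latest`. The sketch below illustrates how a `name:tag` reference splits into its two parts under that rule. It is a simplification that does not handle every form an image reference can take (digests, for instance), and the function name `split_image_reference` is made up for this example.

```python
def split_image_reference(reference):
    """Split 'name[:tag]' into (name, tag), defaulting the tag to 'latest'."""
    name, separator, tag = reference.rpartition(":")
    # No colon at all, or a colon that belongs to a registry address
    # (e.g. "localhost:5000/image"), means no tag was given.
    if not separator or "/" in tag:
        return reference, "latest"
    return name, tag


print(split_image_reference("celebrities:latest"))  # → ('celebrities', 'latest')
print(split_image_reference("celebrities"))         # → ('celebrities', 'latest')
```

This is why `celebrities` and `celebrities:latest` refer to the same image in the commands that follow.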

We have just created our first image! Let's check that the image has been properly created by listing the images currently on this machine:

`docker images`

<p align=center> <img src=images/Docker_images.png width=600> </p>

Ok, so we have just built an image. Now let's run it. We can run an image using the following syntax:

`docker run [OPTIONS] IMAGE[:TAG|@DIGEST] [COMMAND] [ARG...]`

Let's try running our celebrities image with the following command:

`docker run celebrities`

This will throw an error, because the script expects an input, but right now the container is running in non-interactive mode. To change that, we need to add the options `-i` and `-t`. `-i` keeps STDIN open, and `-t` allocates a pseudo-terminal (TTY), so together they give us an interactive session.

<p align=center> <img src=images/Docker_run_error.png width=600> </p>
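A similar failure can be reproduced without Docker. The sketch below assumes that the scraper reads its input with Python's built-in `input()`; the `ask_for_name` function and its prompt are hypothetical stand-ins, not the scraper's real code. It swaps STDIN for an empty stream, which is roughly what a script sees when no input is attached.

```python
import io
import sys


def ask_for_name():
    # Hypothetical prompt standing in for whatever the scraper asks for.
    return input("Name: ")


original_stdin = sys.stdin
sys.stdin = io.StringIO("")  # an input stream with nothing to read
try:
    ask_for_name()
    outcome = "got an answer"
except EOFError:
    outcome = "EOFError: nothing to read from STDIN"
finally:
    sys.stdin = original_stdin  # always restore the real STDIN

print(outcome)
```

With a usable STDIN attached, the same call would wait for you to type an answer instead of failing.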

`docker run -it celebrities`

<p align=center> <img src=images/Docker_run.png width=600> </p>

Success! You can now use this image anywhere Docker is installed, regardless of the host OS and the dependencies installed on it. But how do you distribute it? Using Docker Hub!

You have already created an account on Docker Hub, so now you can go to your command line and run:

```
docker login
```

Then enter your credentials.

The images you push to Docker Hub need to have a specific name:
```
<username>/<image_name>:<tag>
```
So, let's create a copy of the existing image with a new name. We can use the `docker tag` command to do so. The syntax is as follows:

```
docker tag <Image_Id> <New name>
```

The Image_Id can be seen when you run `docker images` (see the picture above). Let's run the following command. In my case, the username is ivanyingx, so replace it with your own username, and the Image_Id is 82a51cbd4876:

```
docker tag 82a51cbd4876 ivanyingx/celebrities:v1
```

Then, you can check that the image has been properly created by running `docker images` again.

<p align=center> <img src=images/Docker_tag.png width=600> </p>

By the way, you can also check this information in the Docker Desktop if you are on Mac or Windows:

<p align=center> <img src=images/Docker_Desktop.png width=600> </p>


Finally, it's time to push the image to Docker Hub. Pushing an image is very similar to pushing a repository to GitHub: simply use `docker push`!

`docker push ivanyingx/celebrities:v1`

You can check that your image has been uploaded by going to your Docker Hub account:

<p align=center> <img src=images/Docker_Hub_example.png width=600> </p>


Any user who wants to run your container can now do so directly on their local machine. For example, in this case, you can run _my_ image by running:

`docker pull ivanyingx/celebrities:v1` This will download the image

and then 

`docker run -it ivanyingx/celebrities:v1` This will run the image

You can also run `docker run -it ivanyingx/celebrities:v1` directly, which will perform both operations if the image is not already on your machine. Note that the tag is needed here: without it, Docker looks for the `latest` tag, and this image was pushed as `v1`.

Congratulations! You have created and pushed your first Docker image! The rest of your Docker path will consist of practicing (a lot!). The rest of the notebook dives deeper into the concepts we have seen so far, adding some commands you might find useful in the Dockerfile.

## Summary

What we've learned:
- What Docker is and how it can help us distribute our application across multiple operating systems. 
- What Docker Hub is and how it can be useful for getting pre-built images.
- What a Docker image is and how to build one using a Dockerfile.
- What a Docker container is and how to run one from the command line.