# Introduction to Docker

<p align=center><a href=https://www.docker.com><img src=images/Docker_Logo.png width=400></a></p>

Programmers often run into compatibility problems between the application being run and the OS or the packages installed on the machine.
[Docker](https://www.docker.com) addresses this issue: it is a containerization platform that enables developers to package applications and their dependencies into lightweight, portable *containers*. These containers can run consistently across various environments, from a developer's laptop to a production server.

## Docker Advantages

Docker is vital in the world of software development for many reasons:

- **Consistency**: Docker provides a consistent environment for applications, ensuring that what runs on a developer's machine will run the same way in production. This consistency reduces the "it works on my machine" problem, which is a common source of frustration in development teams.

- **Isolation**: Docker containers encapsulate applications and their dependencies, isolating them from the host system and other containers. This isolation promotes security and prevents conflicts between applications and libraries.

- **Portability**: Docker containers are highly portable. You can create a container on one machine and run it on any other machine with Docker installed, regardless of the underlying infrastructure. This portability simplifies the deployment process.

- **Efficiency**: Docker containers are lightweight and share the host OS kernel, which makes them efficient in terms of resource utilization. This efficiency allows you to run more containers on the same hardware, optimizing resource allocation.

## Key Docker Components

Docker is composed of several key components, each with its unique role in the containerisation process. These components work together seamlessly to enable the creation, management, and deployment of containers.

### 1. Docker Images

*Docker images* are at the heart of containerization, serving as the building blocks of containers. They act as blueprints for containers. An image contains everything needed to run an application, including:

- The application's code and files
- Libraries and dependencies required for the application
- The operating system files or a minimal OS subset (but not a kernel, which containers share with the host)

Some key aspects to understand about Docker images are:

- Images are immutable, meaning they cannot be changed once created. To modify an image, you create a new image based on an existing one.
- Images are typically based on a specific Linux distribution or a parent image. This parent image forms the foundation upon which your application and its dependencies are layered.
- Docker images are stored in a layered format, where each layer represents a set of changes or instructions. These layers can be shared and reused, promoting efficiency and reducing storage space.

We will learn how to create Docker images in the next section.

### 2. Docker Containers

Docker containers are runnable instances created from Docker images. They encapsulate an application and its runtime environment, ensuring that it runs consistently across different systems. 

A container is similar to a virtual machine (VM), but while a VM virtualises the hardware, a container only virtualises the OS: it shares the host's kernel instead of booting its own. Note that Docker does not make a full copy of the OS you want to work with; rather, the image provides just the files (libraries and utilities) needed to behave like that OS. For example, if an application runs on the latest Ubuntu version, Docker will not install the latest Ubuntu version each time the application runs. Instead, it reuses the image layers containing the files needed to run that version, without installing the entire OS.
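
To see that a container reuses the host's kernel rather than booting its own OS, one option (once Docker is installed, as described in the setup section of this lesson) is to compare the kernel release reported by the host with the one reported inside a container. This is only a sketch: it assumes a Linux host and the official `ubuntu` image from Docker Hub. On macOS or Windows, Docker Desktop runs containers inside a lightweight Linux VM, so the container reports that VM's kernel instead of the host's.

```shell
# Kernel release reported by the host.
host_kernel="$(uname -r)"
echo "Host kernel:      ${host_kernel}"

# Only talk to Docker when the daemon is reachable.
if docker info >/dev/null 2>&1; then
  # On a Linux host the container prints the same kernel release,
  # because it shares the host's kernel...
  echo "Container kernel: $(docker run --rm ubuntu uname -r)"
  # ...while its userland files come from the Ubuntu image.
  docker run --rm ubuntu grep PRETTY_NAME /etc/os-release
else
  echo "Docker is not running; skipping the container half of the comparison."
fi
```

The `--rm` flag removes each short-lived container as soon as its command exits, so nothing is left behind.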

Some key aspects about Docker containers are:

- Containers are isolated from each other and from the host system, allowing multiple applications to run independently without interfering with one another
- Containers can be easily started, stopped, and deleted, providing flexibility and scalability to your applications
- The runtime environment inside a container is defined by the image from which it was created, ensuring consistency between development, testing, and production environments

### 3. Docker Registries

*Docker registries* are repositories that store and distribute Docker images. They are essential for sharing and collaborating on containerized applications.

Public registries like *Docker Hub* host a vast collection of pre-built Docker images that you can use as a starting point for your projects. On the other hand, private registries enable organizations to store and manage their custom images securely, ensuring that sensitive information is not exposed publicly.

As the first step in the learning process, go to the [Docker Hub website](https://hub.docker.com), and create an account.

<p align=center> <img src=images/Docker_Hub.png width=800 height=400> </p>

Later in this lesson, we will revisit Docker Hub, and we will cover Docker registries more extensively in another lesson.

### 4. Docker Volumes

*Docker volumes* are used for managing data persistence between containers and the host system. They play a crucial role in scenarios where you need to preserve data, such as databases or application state. Here's what you should know about Docker volumes:

- Volumes are separate from container file systems, allowing data to persist even if the container is removed
- They can be mounted into containers, enabling data sharing and synchronization between containers and the host
- Volumes are particularly valuable for database containers, where data durability and persistence are critical
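
As a small illustration of the first point, the sketch below writes a file into a named volume from one container and reads it back from a second one after the first has been removed. The volume name `demo-data` and the mount path `/data` are hypothetical choices made for this example, and the official `alpine` image is used only because it is small.

```shell
# Hypothetical volume name used only for this demonstration.
VOLUME_NAME="demo-data"

volume_demo() {
  # Create a named volume managed by Docker.
  docker volume create "$VOLUME_NAME"
  # First container: write a file into the volume, then exit and be removed (--rm).
  docker run --rm -v "$VOLUME_NAME":/data alpine \
    sh -c 'echo "hello from the first container" > /data/greeting.txt'
  # Second container: the file is still there, because it lives in the volume.
  docker run --rm -v "$VOLUME_NAME":/data alpine cat /data/greeting.txt
  # Clean up the volume once the demonstration is over.
  docker volume rm "$VOLUME_NAME"
}

# Run the demonstration only when the Docker daemon is reachable.
if docker info >/dev/null 2>&1; then
  volume_demo
else
  echo "Docker is not running; skipping the volume demo."
fi
```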

We will cover Docker volumes in more detail in another lesson.

## Setting up Docker

> Important! Before proceeding, we recommend that you use VSCode on your local machine, since the files you'll run require the Docker engine to containerize applications locally.

### Installing Docker on Linux

For Linux users, installing Docker involves setting up the Docker engine, which forms the core of Docker technology.

To install it, visit the official Docker documentation for Linux installation and follow the proposed [instructions](https://docs.docker.com/engine/install/). There, you will find the steps for installing Docker depending on your distribution.

If you are using Ubuntu, you can go directly to this [page](https://docs.docker.com/engine/install/ubuntu/). Please follow the instructions for installing the Docker engine using the repository.

<p align=center> <img src=images/Docker_Engine.png width=800 height=500> </p>

### Installing Docker on macOS and Windows (Non-Home Edition) 

For macOS and Windows users without the Home edition, *Docker Desktop* is the recommended installation method. You can download the application from this [website](https://docs.docker.com/desktop/).

<p align=center> <img src=images/Docker_install.png width=900 height=500> </p>

Download Docker Desktop for your OS. After the download is complete, install Docker Desktop by following the on-screen instructions. Once installed, you can launch Docker Desktop from your system tray or menu bar.

### Installing Docker on Windows Home Edition

Unfortunately, the Windows Home edition is highly unsuitable for installing any application that requires kernel access. So, if you are using Windows Home edition, you'll need to enable Hyper-V, using the Hyper-V Enabler to grant access to the kernel. You can download the Hyper-V Enabler [here](https://aicore-files.s3.amazonaws.com/Foundations/DevOps/Hyper-V-Enabler.bat).

Afterwards, in the Windows search bar, type 'Turn Windows Features On or Off' and subsequently **enable** the following:

- Hyper-V management tools
- Hyper-V platform
- Windows hypervisor platform
- Windows subsystem for Linux

<p align=center> <img src=images/Docker_Home_Edition.png width=400> </p>

You may also need to install the latest version of WSL. Install it using the following [file](https://aicore-files.s3.amazonaws.com/Foundations/DevOps/wsl_update_x64.msi).

After enabling these features and installing WSL, follow the instructions provided in the **Installing Docker on macOS and Windows (Non-Home Edition)** section above to complete the Docker Desktop installation.

> Once Docker is successfully installed, you can verify its installation by opening a terminal and running the following command: `docker --version` (this applies to all OSs).
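
A slightly more thorough check is sketched below. Besides printing the version, it tests whether the Docker daemon is reachable and, if so, runs the official `hello-world` test image. The `docker_status` variable and its three values are names invented for this example.

```shell
# Check that the docker CLI is on the PATH before calling it.
if command -v docker >/dev/null 2>&1; then
  docker --version
  # `docker info` also needs the daemon (Docker Engine or Docker Desktop) to be running.
  if docker info >/dev/null 2>&1; then
    # Official test image: prints a greeting and exits.
    # --rm deletes the container afterwards.
    docker run --rm hello-world
    docker_status="running"
  else
    docker_status="installed-but-not-running"
  fi
else
  docker_status="not-installed"
fi
echo "Docker status: ${docker_status}"
```

If the status is `installed-but-not-running`, starting Docker Desktop (or the Docker service on Linux) and re-running the check is usually the next step.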

## Dockerfiles and Container Configuration

*Dockerfiles* are text-based configuration files used to specify how a Docker image should be built. They play a crucial role in creating consistent and reproducible container environments. A Dockerfile contains a series of instructions that define the image's base, environment setup, and application code and dependencies.

### Structure of a Dockerfile

A typical Dockerfile follows a structured format:

- *Base Image*: This is the starting point for your Docker image, often based on an existing image from a registry like Docker Hub

- *Instructions*: Dockerfiles consist of a series of instructions that specify how the image should be configured and what should be included in it. These instructions include actions like installing software, copying files, and configuring environment variables.

- *Commands*: Shell commands are used to execute actions during the image build process. These commands can be used for tasks like installing packages, setting up configurations, or running scripts.


> By default, a Dockerfile has no extension: the file is literally named `Dockerfile`. When a project needs several Dockerfiles, a descriptive prefix can be added. For example, a Dockerfile that specifies the steps for building an API image can be called `api.Dockerfile`; a file with a non-default name must then be passed to `docker build` with the `-f` flag.
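
When a Dockerfile does not use the default name, the `docker build` command (covered later in this lesson) has to be told which file to read through its `-f` flag. A minimal sketch, assuming a hypothetical `api.Dockerfile` in the current directory and a hypothetical image name `api`:

```shell
# Hypothetical file and image names, for illustration only.
DOCKERFILE="api.Dockerfile"
IMAGE="api:latest"

# -f selects the Dockerfile; the trailing "." is still the build context.
BUILD_CMD="docker build -f ${DOCKERFILE} -t ${IMAGE} ."
echo "$BUILD_CMD"

# Run the build only when the file exists and the Docker daemon is reachable.
if [ -f "$DOCKERFILE" ] && docker info >/dev/null 2>&1; then
  docker build -f "$DOCKERFILE" -t "$IMAGE" .
fi
```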

When a Dockerfile is created in VSCode, it will automatically be recognised as a Dockerfile, as indicated by the characteristic whale icon.

<p align=center> <img src=images/Docker_icon.png width=200> </p>

## Hands-On: Creating a Dockerfile

In this example, you will create a Docker image that runs the `celebrity_births` web scraper. You can download the necessary files for running this scraper [here](https://aicore-files.s3.amazonaws.com/Foundations/DevOps/celebrity_example.zip).

After downloading the file, `cd` to that folder, and create a Dockerfile named `Dockerfile`. Inside the Dockerfile, write the following: 

```Dockerfile
FROM python:3.8-slim-buster
```

> Every Docker image starts with a base image. This is the foundation upon which your image will be built.

Conventionally, Docker images are built on top of a pre-built image that can be found on Docker Hub. The pre-built image usually contains some dependencies. A common use case is to use an image with Python installed. The `FROM` instruction, as indicated above, declares this base image; Docker downloads it during the build if it is not already available locally.

Thus, with this first instruction, we begin creating the image with the necessary Python dependencies.

Dockerfiles then consist of a series of instructions that specify how the image should be configured and what should be included in it. These instructions include actions like installing software, copying files, setting environment variables and more.

In our example, we will continue by adding the following line to our Dockerfile:

```
COPY . . 
```

This will copy everything in the Dockerfile directory (`requirements.txt` and the `scraper` folder) into the image.

> Understanding this step is extremely important. When an image is built, the relevant files are copied into the image's own filesystem, which is analogous to copying them onto a different and separate computer. In other words, it is almost as if there is a separate mini computer containing the scraper, with Python installed.

The first `.` argument following the `COPY` instruction is the location of the assets **on your machine** that you wish to copy (relative to the build context, here the Dockerfile directory). The second `.` argument is the location where the assets will be copied to **inside the image**, relative to its current working directory.

As the final step before running the scraper, your Python packages must be installed, e.g. `beautifulsoup` and `requests`. Fortunately, the requirements file was also copied into the image. Thus, the packages can be installed directly using the `RUN` command, followed by the bash command:

```
RUN pip install -r requirements.txt
```

Now, we can run the Python script. Note that the `RUN` clause is unsuitable here because `RUN` is executed when the image is built. This is where you perform actions like installing software, setting up configurations, and adding files to the image. It affects the image's content but doesn't dictate what happens when a container is started from the image. 

On the other hand, the `CMD` instruction is used to specify the default command that should be executed when a container is run from the image. In essence, it determines the container's behaviour when it starts:

```
CMD ["python", "scraper/celebrity_scraper.py"]
```

The `CMD` instruction can be written in several forms. Here we use the square-bracket (exec) form, in which the first item is the executable (`python`) and the remaining items are its arguments (here, the path of the script). We will discuss the different Dockerfile instructions in more detail in a later lesson.
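
Putting the four instructions together, the finished Dockerfile can be created in one go. The sketch below writes it from the shell with a heredoc. It uses a temporary directory (an assumption made here so that an existing file is not overwritten), whereas in practice the file lives in the `celebrity_example` folder.

```shell
# Work in a scratch directory so an existing Dockerfile is not overwritten.
workdir="$(mktemp -d)"

# Write the four instructions from this section into a file named Dockerfile.
cat > "${workdir}/Dockerfile" <<'EOF'
# Base image with Python preinstalled
FROM python:3.8-slim-buster
# Copy the build context (requirements.txt and the scraper folder) into the image
COPY . .
# Install the Python dependencies at build time
RUN pip install -r requirements.txt
# Default command executed when a container starts
CMD ["python", "scraper/celebrity_scraper.py"]
EOF

# Print the result for inspection.
cat "${workdir}/Dockerfile"
```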

### Best Practices for Dockerfile Creation

To wrap up this hands-on, it's important to consider best practices when creating Dockerfiles, as these ensure the efficiency, security and maintainability of your application. Some key best practices include:

- **Use Official Base Images**: Whenever possible, start with official base images provided by the software's maintainers (e.g., Node.js, Python, Nginx) to ensure security and reliability

- **Minimize Layer Count**: Limit the number of layers (a layer represents a set of changes) in your image to reduce image size and improve build and push/pull times

- **Clean Up**: Remove unnecessary files and dependencies in the same Dockerfile instruction to minimize the image size

- **Security**: Ensure your Dockerfile and image follow security best practices, such as not running as root and using trusted sources for software installation

- **Documentation**: Include comments and labels in your Dockerfile to document the image's purpose, maintainer, and version
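
As a hedged illustration of several of these practices, here is one possible variant of the scraper Dockerfile. It is a sketch rather than the lesson's reference solution: the label values, the `appuser` account, the `/home/appuser` working directory and the `.dockerignore` entries are all assumptions introduced for this example.

```shell
# Scratch directory, so the files from the previous hands-on are left untouched.
workdir="$(mktemp -d)"

cat > "${workdir}/Dockerfile" <<'EOF'
# Official Python base image
FROM python:3.8-slim-buster
# Documentation: labels describing the image (placeholder values)
LABEL maintainer="you@example.com" version="1.0"
# Security: create an unprivileged user (hypothetical name) with a home directory
RUN useradd --create-home appuser
WORKDIR /home/appuser
# Copy the project, owned by the unprivileged user
COPY --chown=appuser:appuser . .
# Clean up: --no-cache-dir keeps pip's download cache out of the layer
RUN pip install --no-cache-dir -r requirements.txt
# Drop root privileges before the application starts
USER appuser
CMD ["python", "scraper/celebrity_scraper.py"]
EOF

# Keep files that the image does not need out of the build context.
cat > "${workdir}/.dockerignore" <<'EOF'
.git
__pycache__/
*.pyc
EOF

ls -A "$workdir"
```

The packages are installed while the build still runs as root, and the switch to `appuser` happens only afterwards, so the running container no longer has root privileges.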

## Hands-On: Building Docker Images from Dockerfiles

To create a Docker image, you use the `docker build` command. This command reads the instructions in a Dockerfile and executes them step by step to construct the image. Instructions that change the filesystem (such as `RUN` and `COPY`) each produce a new image layer, while others (such as `CMD`) only record metadata. Docker images are therefore constructed from multiple layers, each representing a set of changes made by a Dockerfile instruction.

The basic syntax for building an image is as follows: `docker build -t <image_name>:<tag> <path_to_Dockerfile_directory>`.

- `-t`: Specifies the name and optional tag for the image. Tags provide a way to version your images (e.g., my-app:1.0).

- `<image_name>`: The name you want to give to your Docker image

- `<tag>`: An optional tag for versioning your image

- `<path_to_Dockerfile_directory>`: The directory where your Dockerfile is located

You can view all the options [here](https://docs.docker.com/engine/reference/commandline/build/).

For example, to build an image named `my-app` with the tag `v1.0` from a Dockerfile in the current directory, you would run: `docker build -t my-app:v1.0 .` (note the trailing dot, which is the path to the Dockerfile directory).

> The typical naming convention for Docker images is `image_name:version`. Typically we specify the version as `latest` rather than manually writing out the semantic versioning label.

Going back to our previous scraper hands-on: to build the Docker image from the previously created Dockerfile, change the directory to `celebrity_example` in the CLI. Then use Docker's `build` command, following the syntax above:

`docker build -t celebrities:latest .`

Since we are in the same directory as the Dockerfile, the Dockerfile path is simply a dot (`.`). To verify if the image was successfully created, you can run the following command:

`docker images  # show our current images on this machine`

<p align=center> <img src=images/Docker_images.png width=600> </p>
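
To see how the Dockerfile instructions map onto image layers, `docker history` lists the layers of an image together with the instruction that created each one. The sketch below assumes that the `celebrities:latest` image built above exists locally; `inspect_image` is a helper name made up for this example.

```shell
IMAGE="celebrities:latest"

inspect_image() {
  # List only this repository instead of every local image.
  docker images "${IMAGE%:*}"
  # Show the layers (newest first) with the instruction that created each one.
  docker history "$IMAGE"
}

# Inspect only if Docker is reachable and the image has been built.
if docker image inspect "$IMAGE" >/dev/null 2>&1; then
  inspect_image
else
  echo "Image ${IMAGE} not found locally; build it first."
fi
```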

## Hands-On: Running Docker Containers

Running a Docker container is straightforward using the `docker run` command. Here's the basic syntax:

`docker run [OPTIONS] IMAGE[:TAG|@DIGEST] [COMMAND] [ARG...]`

To run the `celebrities` image, you would use the following command:

`docker run celebrities`

This will throw an error because the script expects user input, which is impossible while the container runs in non-interactive mode. The solution is to add two options: `-i` keeps STDIN open, and `-t` allocates a pseudo-terminal (TTY). Together, they make the session interactive.

<p align=center> <img src=images/Docker_run_error.png width=600> </p>

`docker run -it celebrities`

<p align=center> <img src=images/Docker_run.png width=600> </p>

There are other common Docker container operations you may need to use. These include:

- **Stopping a Container**: To stop a running container gracefully, use the `docker stop` command followed by the container's ID or name: `docker stop <container_id_or_name>`. You can obtain the container ID using the `docker ps` command.

- **Removing a Container**: If you want to remove a stopped container, you can use the `docker rm` command with the container's ID or name: `docker rm <container_id_or_name>`

- **Listing Running Containers**: To see a list of running containers, you can run the `docker ps` command

- **Listing All Containers**: To see a list of all containers (including stopped ones), add the `-a` flag: `docker ps -a`
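
These operations can be tried end to end on a throwaway container. The sketch below is an illustration under stated assumptions: the container name `lifecycle-demo` is hypothetical, and the official `alpine` image running `sleep` stands in for a long-running application.

```shell
# Hypothetical container name; alpine is a small official image used as a stand-in.
CONTAINER="lifecycle-demo"

lifecycle_demo() {
  # Start a container in the background (-d) that just sleeps for five minutes.
  docker run -d --name "$CONTAINER" alpine sleep 300
  # It shows up among the running containers.
  docker ps --filter "name=${CONTAINER}"
  # Stop it: Docker sends SIGTERM, then SIGKILL after a grace period
  # (ten seconds by default), so this step can take a moment.
  docker stop "$CONTAINER"
  # A stopped container is hidden from `docker ps` but listed by `docker ps -a`.
  docker ps -a --filter "name=${CONTAINER}"
  # Remove the stopped container.
  docker rm "$CONTAINER"
}

# Run the demonstration only when the Docker daemon is reachable.
if docker info >/dev/null 2>&1; then
  lifecycle_demo
else
  echo "Docker is not running; skipping the lifecycle demo."
fi
```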

## Hands-On: Pushing Images to Docker Hub

Now that we have successfully built the `celebrities` image in the previous hands-on, it can be run on any machine with Docker installed, regardless of the dependencies installed there. Additionally, you can distribute it globally using Docker Hub. To do this, you need to log in to your Docker Hub account using the CLI:

```
docker login
```

You will be prompted to enter your Docker Hub username and password. After entering your credentials, you should see a successful login message.

> Before pushing an image to Docker Hub, you need to tag it with the appropriate repository name and optionally specify a tag. The repository name typically follows the format `<username>/<image_name>`.

Use the `docker tag` command to add the repository name and tag to your image: `docker tag <image_id> <username>/<image_name>:<tag>`.

- `<image_id>`: The image ID of your existing Docker image. You can find this out by running `docker images`.
- `<username>/<image_name>`: The repository name on Docker Hub where you want to push the image
- `<tag>` (optional): A specific tag for versioning your image (e.g., v1.0). If not specified, it defaults to latest

Let's tag our previously created image:

```
docker tag 82a51cbd4876 ivanyingx/celebrities:v1
```

Above, `ivanyingx` is the username, which you should replace with yours, and `82a51cbd4876` is the image ID. Afterwards, confirm that the image has been properly tagged by running `docker images` once more.

<p align=center> <img src=images/Docker_tag.png width=800 height=100> </p>

Incidentally, it is also possible to confirm this information in Docker Desktop if you are on Mac or Windows:

<p align=center> <img src=images/Docker_Desktop.png width=900 height=600> </p>

With the image tagged, you can now push it to Docker Hub using the `docker push` command: `docker push <username>/<image_name>:<tag>`. For this example, the command is `docker push ivanyingx/celebrities:v1`.

To verify that your image has been uploaded, go to your Docker Hub account and check that you can see the newly pushed image.

<p align=center> <img src=images/Docker_Hub_example.png width=1000 height=400> </p>
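
The tag and push steps can also be wrapped in a small script so that the username appears in only one place. This is a sketch: `your-username` is a placeholder, `publish_image` is a helper name invented here, and the script assumes that `celebrities:latest` exists locally and that `docker login` has already succeeded.

```shell
# Replace with your own Docker Hub username (placeholder value).
DOCKER_USER="your-username"
LOCAL_IMAGE="celebrities:latest"
REMOTE_IMAGE="${DOCKER_USER}/celebrities:v1"

publish_image() {
  # Give the local image a second name that includes the Docker Hub repository.
  docker tag "$LOCAL_IMAGE" "$REMOTE_IMAGE" || return 1
  # Upload it; this requires a prior successful `docker login`.
  docker push "$REMOTE_IMAGE"
}

echo "Ready to publish ${LOCAL_IMAGE} as ${REMOTE_IMAGE}"
# Uncomment once DOCKER_USER is set to a real account:
# publish_image
```

Here `docker tag` is given the image name instead of the image ID; both identify the same local image.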

If any other Docker Hub user wants to run your container, they can do so directly on their local machine. For example, in this case, you can run `ivanyingx`'s image as follows:

Run `docker pull ivanyingx/celebrities:v1` to download the image,

and `docker run -it ivanyingx/celebrities:v1` to run it.

It is also possible to run `docker run -it ivanyingx/celebrities:v1` directly, which pulls the image first if it is not already available locally. Note that the tag must be included here: without it, Docker looks for the `latest` tag, which was not pushed in this example.

Congratulations! You've successfully built and pushed your first Docker image.

## Key Takeaways

- Docker is a containerization platform that simplifies software development, deployment, and scaling by packaging applications and their dependencies into containers
- Containers provide lightweight and isolated environments for running applications consistently across different systems
- Docker components include images, containers, registries, and volumes
- Images are read-only templates for containers, while containers are runnable instances created from images
- Registries are repositories for Docker images, and Docker Hub is a popular public registry
- Volumes provide persistent data storage for containers
- Dockerfiles are text-based configuration files that define how Docker images should be built
- Dockerfile structure includes base image selection, instructions for setting up the environment, copying files, running commands, exposing ports, and defining the default command