#Chapter 1: What is Docker?

##Containers
A container is a portable computing environment which contains:
<ul>
  <li>Code</li>
  <li>Dependencies</li>
  <li>Configuration</li>
</ul>

Benefits of containers:
<ul>
  <li>reproducibility: run the same now and in the future</li>
  <li>portability: run the same on different machines</li>
  <li>security: limited access to the host machine; isolated from other containers</li>
  <li>lightweight: use few resources</li>
</ul>

##Docker Engine

Docker Engine is used to create, run, and manage containers.<br>
This course will not cover:
<ul>
  <li>Docker Compose: a tool for running multi-container Docker apps</li>
  <li>Kubernetes: a system for container scheduling and management</li>
</ul>

Docker Engine is composed of:
<ul>
  <li>Docker client: CLI used to talk to the server </li>
  <li>API</li>
  <li>Docker daemon</li>
</ul>

The Docker client is a command-line interface (CLI) used to talk to the Docker server. The server (also called the daemon) is a background process that requires no human interaction. The API specification defines how users and other software can interact with the daemon. The daemon manages all Docker objects (containers, images, etc.).

An image is a blueprint for a container and includes all the software the container needs. A container is a running instance of an image.

By default, the containerized process can only access the container's own filesystem, not the rest of the files on the host machine.

Each container has its own OS environment (filesystem, libraries, and tools), separate from the host OS and from other containers, but it shares the host's kernel.




##Container vs Virtual Machine

Both containers and VMs are virtualization technologies.

Virtualization: resources (RAM, CPU, etc.) can be split to look like separate resources.<br>
Those resources can be allocated to separate processes so no overlap occurs.

Containers
<ul>
  <li>Virtualization occurs in a software layer above the OS</li>
  <li>Weaker isolation: possible for attackers to gain access to the host OS or other containers</li>
  <li>Much smaller size on disk compared to VMs</li>
  <li>Only need part of OS as part of host OS is shared</li>
  <li>Faster to start, stop, distribute, and update</li>
  <li>Mainly suited to CLI apps (GUI apps need extra setup)</li>
</ul>

Virtual Machines
<ul>
  <li>virtualize the entire host machine down to the hardware</li>
  <li>Higher security compared to containers</li>
  <li>Much larger size on disk</li>
  <li>Each VM needs a complete guest OS; nothing is shared with the host OS</li>
  <li>Supports GUI and CLI apps</li>
</ul>

#Chapter 2: Using Docker Containers



##Commands

```
nano <file-name> #open <file-name> in nano text editor
touch <file-name> #create empty file with given name
echo "<text>" #prints <text> to the console
<command> >> <file> # appends output of <command> to the end of <file>
<command> -y # auto-respond yes to all prompts (for commands that support -y, e.g. apt-get)
```
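As a quick illustration of how these commands combine, the sketch below creates an empty file in a temporary directory and appends two lines to it. The directory variable and the file name `notes.txt` are placeholders chosen for this example.

```shell
# Work in a throwaway directory so nothing in the current folder is touched.
workdir="$(mktemp -d)"

# Create an empty file, then append two lines to it with >>.
touch "$workdir/notes.txt"
echo "first line" >> "$workdir/notes.txt"
echo "second line" >> "$workdir/notes.txt"

# Print the file to confirm both lines were appended in order.
cat "$workdir/notes.txt"
```

Because `>>` appends rather than overwrites, running the two `echo` lines again would add two more lines to the same file.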

Every docker command begins with 'docker'

```
#Start a container from the specified image
docker run <image-name>

#Start a container with an interactive shell
docker run -it <image-name>

#Run a (detached) container in the background without printing to the shell
docker run -d <image-name>

#Exit an interactive container (typed inside the container's shell, not a docker command)
exit

#List running containers with additional information
docker ps

#Stop the specified container
docker stop <container-id-or-name>
```

NOTE: stopping a Docker container does not remove it from disk. You must use the 'rm' command (below).

##More advanced commands

```
#Set a name for given container
docker run --name <container-name> <image-name>

#find a running container by name
docker ps -f "name=<container-name>"

#Find container output/log
docker logs <container-id>

#Use -f flag to follow a container's log
docker logs -f <container-id>

#Remove a stopped container
docker container rm <container-id>
```
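Putting the commands above together, one possible session for a background container might look like the sketch below. The container name `notes-demo` and the `sh -c` command used to keep the container alive are assumptions made for illustration. The script does nothing if no Docker daemon is reachable.

```shell
# Hypothetical container name used throughout this example.
name="notes-demo"

# Only talk to Docker when a daemon is actually reachable.
if docker info >/dev/null 2>&1; then
  # Start a named container in the background; the trailing command keeps it alive.
  docker run -d --name "$name" ubuntu sh -c 'echo started; sleep 300'

  # Find the container by name, then read what it has printed so far.
  docker ps -f "name=$name"
  docker logs "$name"

  # Stop the container, then remove it from disk.
  docker stop "$name"
  docker container rm "$name"
else
  echo "Docker daemon not reachable; skipping the demo."
fi
```

Naming the container up front means every later command can refer to it by name instead of looking up its ID.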

##Managing local project images

Docker Hub contains many pre-made docker images.

```
#pull a Docker image from Docker Hub
docker pull <image-name>:<image-version>

#pull an image from another registry
docker pull <url-to-image>
```

If the image version isn't specified, Docker pulls the image tagged `latest`.

```
#List all available local images with info
docker images

#remove an image (no containers, running or stopped, may still be based on it)
docker image rm <image-name>

#remove all stopped containers
docker container prune

#remove all unused images
docker image prune -a
```
A dangling image is an image that no longer has a name (e.g. because the name has been reused). `docker image prune` without `-a` removes only dangling images.




##Distributing Docker images

Private Docker registries are not maintained by Docker, so there is no guarantee of image quality.

```
#push docker image to registry
docker image push <image-name>

#Image name must start with the registry URL. Rename with
docker tag <image-name> <new-name-with-url>

#Login to registry
docker login <url>

#Save a docker image to a file (to send to specific people instead of pushing to a registry)
docker save -o <image-name.tar> <image-name>

#load an image from a file
docker load -i <image-name.tar>
```
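The naming rule for pushing can be sketched as follows. The registry URL, image name, and version are all hypothetical placeholders. The script only assembles the name and prints the commands that would follow, since pushing needs real credentials.

```shell
# Hypothetical registry, image name, and version for illustration only.
registry="registry.example.com"
image="pipeline"
version="1.0"

# A pushable name has the form <registry>/<image>:<version>.
remote_name="$registry/$image:$version"

# Print the commands that would rename the local image, log in, and push it.
echo "docker tag $image:$version $remote_name"
echo "docker login $registry"
echo "docker image push $remote_name"
```

Docker uses the registry prefix in the name to decide where the push goes, which is why the tag step comes first.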



#Chapter 3: Writing Docker images

Dockerfile: text file with instructions to create an image<br>
Docker Image: blueprint to create a container.<br>
Docker Container: a running image.

Dockerfiles always start from another image, specified with the FROM instruction, e.g. `FROM ubuntu:22.04`.

```
#Build a Docker image from a Dockerfile. The path is the folder containing the Dockerfile; use '.' for the current directory
docker build <path-to-dockerfile>

#Can specify a name and version like below
docker build -t <image-name>:<image-version> <path-to-dockerfile>
```

##Customizing images

After the FROM instruction, use RUN instructions to execute the usual shell commands
that install specific software/packages in the image.

```
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python3
```

##Managing files in an image

```
#Copy a file from source/host to dest/image. If src is a folder, its entire contents are copied
COPY <src-path-on-host> <dest-path-on-image>

#Can also download files. Best to do following steps
#Download the file
RUN curl <file-url> -o <dest-folder>/<filename>.zip

#Unzip the file
RUN unzip <dest-folder>/<filename>.zip

#Remove original zip file
RUN rm <dest-folder>/<filename>.zip
```
Each instruction adds to the image size, even when a later instruction deletes the files again.<br>
Best practice is to combine these into a single RUN instruction:

```
RUN curl <file-download-url> -o <destination-dir>/<filename>.zip \
&& unzip <destination-dir>/<filename>.zip -d <unzipped-directory> \
&& rm <destination-dir>/<filename>.zip
```

NOTE: Docker can't access files outside the build context, e.g. in a parent directory of the path passed to `docker build`.
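A minimal sketch of a project folder that respects this rule is shown below. The folder is a temporary directory, and the script contents and the image name `demo-pipeline:1.0` are assumptions for illustration. The build command is only printed, not run.

```shell
# Throwaway project folder; its path is what would be passed to docker build.
project_dir="$(mktemp -d)"

# A file the image needs, stored next to the Dockerfile (inside the build context).
echo 'print("hello from the pipeline")' > "$project_dir/my_pipeline.py"

# The quoted 'EOF' keeps the shell from altering the Dockerfile text.
cat > "$project_dir/Dockerfile" <<'EOF'
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3
# Source paths in COPY are relative to the build context
COPY my_pipeline.py /app/my_pipeline.py
CMD python3 /app/my_pipeline.py
EOF

# List the build context and print the command that would build the image.
ls "$project_dir"
echo "docker build -t demo-pipeline:1.0 $project_dir"
```

Keeping every file the image needs inside one folder is what makes the `COPY` line work; a source path such as `../my_pipeline.py` would be rejected.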

##Choosing a start command for a Docker image

Any shell command can be chosen to execute when a container is started.<br>
CMD does not add to the size of the image. Only the last CMD instruction in a Dockerfile takes effect.

```
#run a python script
CMD python3 my_pipeline.py

#start a database
CMD postgres
```

Typically run a script which starts other applications

```
#assumes start.sh is executable and in the working directory
CMD ./start.sh
```

##Overriding the start command

```
docker run <image> <shell-command>
```

Usually run image in interactive mode when overriding start command.

```
docker run -it <image> <shell-command>

#example with ubuntu
docker run -it ubuntu bash
```
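To see an override without an interactive shell, a minimal sketch is to pass a one-off command after the image name. The message text is arbitrary, and `--rm` (not covered above) simply removes the container once it exits. The script skips the run if no Docker daemon is reachable.

```shell
# Arbitrary text for the overriding command to print.
msg="overridden start command"

if docker info >/dev/null 2>&1; then
  # The command after the image name replaces the image's CMD for this container only.
  docker run --rm ubuntu echo "$msg"
else
  echo "Docker daemon not reachable; skipping the demo."
fi
```

The image itself is unchanged: the next `docker run ubuntu` without a trailing command falls back to the start command stored in the image.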

##Caching and Docker layers

During a build, Docker tracks which instruction makes which changes to which files.
An image can be viewed as a consecutive list of changes to the filesystem.

<b>Docker layer</b>: All changes caused by a single Dockerfile instruction.<br>

Docker caches each of the changes made and reuses layers which haven't changed since a previous build. Docker only reuses a layer if the current instruction and all previous instructions are unchanged from the previous build.

e.g. Docker won't know when a new version of Python 3 has been released.<br>
It will keep using the cached layer from the apt-get update/install python3 instruction.<br>

Typically you want to order a Dockerfile's instructions from least to most frequently changing. This way as many cached layers as possible are reused, saving time.
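One way this ordering could look is sketched below. The file names `requirements.txt` and `my_pipeline.py` and the `/app` folder are hypothetical; the script writes the Dockerfile to a temporary directory and prints its instructions in order.

```shell
# Throwaway folder that holds the example Dockerfile.
build_dir="$(mktemp -d)"

# The quoted 'EOF' keeps the shell from altering the Dockerfile text.
cat > "$build_dir/Dockerfile" <<'EOF'
# Rarely changes: base image
FROM ubuntu:22.04
# Changes occasionally: system packages
RUN apt-get update && apt-get install -y python3 python3-pip
# Changes sometimes: Python dependencies
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r /app/requirements.txt
# Changes most often: the project's own code
COPY my_pipeline.py /app/my_pipeline.py
CMD python3 /app/my_pipeline.py
EOF

# Print only the instructions (no comments) to show their order.
grep -v '^#' "$build_dir/Dockerfile"
```

With this layout, editing `my_pipeline.py` only invalidates the last `COPY` layer, so the slower package installation steps are served from the cache.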

#Chapter 4: Creating Secure Docker Images

##Changing users and working directory

WORKDIR: changes working directory in which subsequent commands are executed<br>
USER: changes which user is executing the subsequent instructions<br>

Best practice to use root user only to create other users and give them user-specific permissions. Then stop using root user.

The last USER instruction in the Dockerfile also sets the default user for anyone running a container from the image.
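A minimal sketch of this pattern follows, assuming an unprivileged user called `repl` (a hypothetical name) created with `useradd`. The script only writes the Dockerfile to a temporary directory and prints it.

```shell
# Throwaway folder that holds the example Dockerfile.
build_dir="$(mktemp -d)"

# The quoted 'EOF' keeps the shell from altering the Dockerfile text.
cat > "$build_dir/Dockerfile" <<'EOF'
FROM ubuntu:22.04
# Still root here: install software and create an unprivileged user
RUN apt-get update && apt-get install -y python3
RUN useradd -m repl
# From this point on, instructions and the running container use repl
USER repl
WORKDIR /home/repl
EOF

# Show the resulting Dockerfile.
cat "$build_dir/Dockerfile"
```

Anything that needs root, such as installing packages, has to come before the `USER repl` line.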

##Variables in Dockerfiles

Variables make Dockerfiles less verbose and easier to change/update.<br>
ARG changes the behavior of a Dockerfile during the build.<br>
ENV changes behavior at runtime. Not possible to override ENV at build time.<br>
ENV variables can be overridden when starting a container from an image.
```
docker run --env <key>=<value> <image-name>
```

```
ARG <var_name>=<var_value>
ARG path=/home/repl

#Use $ to access the variable in later instructions
COPY /local/path $path
```

Variables defined with ARG are only accessible in the Dockerfile during the build.<br>
Typical uses are Python versions or file paths.<br>
An ARG variable can be overwritten in the build command:

```
docker build --build-arg project_folder=/repl/pipeline .
```

Can also create variables with ENV instruction. This allows changes at runtime

```
ENV <var_name>=<var_value>

#Example
ENV MODE=production
```

ARG and ENV variables are not secret. Anyone can look at the variables set in a Dockerfile after the image is built, using `docker history <image-name>`.

Commands run at build/start time can be found in the bash history of the host or image.
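To tie ARG and ENV together, here is a small sketch of a Dockerfile that uses both, followed by the commands that would override each one. The image name `demo-vars` and the default folder are hypothetical; the script writes the Dockerfile to a temporary directory and only prints the docker commands.

```shell
# Throwaway folder that holds the example Dockerfile.
build_dir="$(mktemp -d)"

# The quoted 'EOF' stops the shell from expanding $project_folder and $MODE here.
cat > "$build_dir/Dockerfile" <<'EOF'
FROM ubuntu:22.04
# Build-time variable with a default; override with --build-arg
ARG project_folder=/home/repl/pipeline
# Run-time variable; override with --env when starting a container
ENV MODE=production
WORKDIR $project_folder
CMD echo "running in $MODE mode"
EOF

# Print the commands that would override each variable.
echo "docker build --build-arg project_folder=/repl/pipeline -t demo-vars $build_dir"
echo "docker run --env MODE=development demo-vars"
```

The first command changes a value that is baked into the image, while the second changes a value for a single container only.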

##Creating secure Docker images

Following steps can make containers more secure:
<ul>
  <li>Start from a trusted image.</li>
  <li>Update software in images manually, as images on Docker Hub are not always completely up-to-date</li>
  <li>Keep images minimal - Install only essential packages</li>
  <li>Make containers start as user with only specific permissions</li>
</ul>

##Where to go from here

<ul>
  <li>More Dockerfile instructions
    <ul>
      <li>ENTRYPOINT</li>
      <li>HEALTHCHECK</li>
      <li>EXPOSE</li>
    </ul>
  </li>
  <li>Build image from scratch</li>
  <li>Multi-stage builds</li>
  <li>Networking: connect containers to a network</li>
  <li>Volumes: access local/saved files in a new way</li>
  <li>Docker Compose</li>
  <li>Kubernetes</li>
</ul>