# Docker

__SUMMARY__
> docker system prune

> docker system df  ----> disk usage summary

> docker system info  ----> complete info

> docker rmi

> docker rm

> docker volume rm

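### Purging unused containers, images, networks and build cache

`docker system prune` _Prune means to remove unwanted stuff: stopped containers, unused networks, dangling images and build cache_

`docker system prune -a` _Also removes all unused images, not just dangling ones_

`docker system prune --volumes` _Also removes unused volumes, which are skipped by default_

> Attention: prune will not remove running containers. Use `docker info` to get detailed info about containers, images and other specs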

## ----------------------------------- CONTAINERS -----------------------------------

### List all containers
`docker ps -a`

`docker ps -aq` [q: only ids]

### Stop all running containers
`docker stop $(docker ps -q)` [without `-a`, only running containers are listed]

### Remove all containers
`docker rm $(docker ps -aq)`

### Run and remove
`docker run --rm image_name`

### Remove only exited containers
`docker rm $(docker ps -aq -f status=exited)`

### Remove containers according to pattern
`docker ps -a | grep "pattern" | awk '{print $1}' | xargs docker rm`
> awk and xargs to supply the ID to docker rm
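
To see what the `grep | awk` stage hands over to `xargs`, here is a minimal sketch that runs the same filter on made-up text shaped like `docker ps -a` output, so it works without a Docker daemon. The IDs, names and the `web_` pattern are hypothetical.

```shell
# Hypothetical sample resembling `docker ps -a` output; IDs and names are made up.
sample_ps_output='CONTAINER ID   IMAGE   COMMAND   CREATED       STATUS       NAMES
cc73eb6d6f75   nginx   "nginx"   2 hours ago   Exited (0)   web_old
1a2b3c4d5e6f   redis   "redis"   3 hours ago   Up 3 hours   cache
9f8e7d6c5b4a   nginx   "nginx"   5 hours ago   Exited (0)   web_older'

# Same filter as above: keep lines matching the pattern, then print column 1 (the container ID).
matching_ids=$(printf '%s\n' "$sample_ps_output" | grep "web_" | awk '{print $1}')
echo "$matching_ids"
```

In the real pipeline these IDs are what `xargs docker rm` receives as arguments. Note that `grep` matches anywhere on the line (image, command, name), so a loose pattern can catch more containers than intended.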

## ----------------------------------- IMAGES -----------------------------------

### List images
`docker images` or `docker image ls`

### Remove all images
`docker rmi $(docker images -aq)` _q: for id_

### List dangling images
`docker images -f dangling=true`

### Remove dangling images
`docker image prune`

### Remove images according to pattern
`docker images -a | grep "pattern" | awk '{print $3}' | xargs docker rmi`


## ----------------------------------- VOLUMES -----------------------------------

### List volumes
`docker volume ls`

### Remove volume
`docker volume rm vol_name`

### List dangling volumes
`docker volume ls -f dangling=true`

### Remove dangling volumes

Removes unused volumes, i.e. volumes not used by any container

`docker volume prune [--force]`

> With --force, it will not ask for confirmation

> Even if a container is stopped, Docker still considers its volumes to be in use

### Remove container and its volume
`docker rm -v container_name`

> `-v` removes only the anonymous volumes attached to the container; named volumes are kept

Summary of docker volume subcommands
```
docker volume create
docker volume ls
docker volume inspect
docker volume rm
docker volume prune
```

As per the recommendations, I should keep three things in mind on _DATA PERSISTENCE_:

1. Prefer a volume over a bind mount or tmpfs (temporary file system: stores data in RAM)
2. Use `--mount source=<name of volume created>,target=<absolute destination in container>`
3. Use a bind mount when the data lives in some other host directory (instead of /var/lib/docker) and we want to mount it into the container

-----------


### To run bash on docker image
```bash
# Assuming an Ubuntu Docker image
$ docker run -it --name <container_name> <image> /bin/bash
```

> This won't work if your image has a defined ENTRYPOINT. For these cases use: `docker run -it --entrypoint /bin/bash <image>`

### To go inside container running in detached state
`docker exec -it cc73eb6d6f75 bash`

------------

## Theory on Docker and Docker Layer Caching

This is where the paradigm shift comes into place: software is no longer packaged as a platform-specific binary artifact (jar, dll, tgz) but as a fully fledged virtual environment in the form of Docker images. This means that developers can run the code locally exactly as it is run on dev, test or prod environments. The operations teams have only Docker images to deal with, with much less need to understand the inner workings of the specific platform they are deploying.

Every time you change or update the application code, you need to build a new version of the image that can be used for deployment. Even though you’d typically only change the application code, the entire image needs to be built from scratch — including all dependencies. To help with the efficiency of the typical development process and shorten the feedback loop cycle, Docker has introduced the concept of layer caching.
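
A minimal sketch of how instruction order lets the cache help, assuming a Node.js app whose dependencies are declared in `package.json` (the `/app` directory and the scratch build directory are hypothetical). The Dockerfile is written through a heredoc only to keep the snippet self-contained.

```shell
# Write an illustrative Dockerfile into a scratch directory (paths are hypothetical).
build_dir=$(mktemp -d)
cat > "$build_dir/Dockerfile" <<'EOF'
FROM node:10.9.0
WORKDIR /app
# Dependency manifests change rarely, so this layer and the install below stay cached.
COPY package*.json ./
RUN npm install
# Application code changes often; only the layers from here on are rebuilt.
COPY . .
CMD ["npm", "start"]
EOF
cat "$build_dir/Dockerfile"
```

With this ordering, a code-only change should invalidate the cache from `COPY . .` onwards, while the `npm install` layer is reused as long as the manifests are unchanged.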

## Theory on Docker Compose and Buildkit

We can use `docker build` and `docker run` to build and test/run our applications. But if we have multiple Dockerfiles, e.g. one for a client app and one for a server, switching back and forth to build and run each of them becomes cumbersome. To solve this, we have `docker-compose`, which can build and run all of them with a single command.

With the help of BuildKit (experimental when these notes were written; the default builder since Docker Engine 23.0) we can share the layer cache of `docker build` with `docker-compose`. Without it, when we run `docker build` and then `docker-compose build`, Compose builds the Dockerfile from scratch and ignores the cache left by the earlier `docker build`. With BuildKit enabled, docker-compose reuses those cached layers.

Note: when we use BuildKit for the first time, it creates its own layer storage. This means the first build runs from scratch, but subsequent docker-compose builds use the BuildKit cache instead of rebuilding everything.

Brief introduction [5mins read]: https://medium.com/better-programming/sharing-cached-layer-between-docker-and-docker-compose-builds-c2e5f751cee4
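
A minimal sketch of enabling this, assuming docker-compose 1.25 or later, where these two environment variables route Compose builds through the Docker CLI with BuildKit. `my_app` is a placeholder image name, and newer Docker/Compose releases may already behave this way without the variables.

```shell
# Route docker-compose builds through the Docker CLI with BuildKit enabled,
# so both tools read and write the same layer cache.
export DOCKER_BUILDKIT=1
export COMPOSE_DOCKER_CLI_BUILD=1

# With both variables set, these two commands would share cached layers
# (run them from a project that has a Dockerfile and a docker-compose.yml):
#   docker build -t my_app .
#   docker-compose build
echo "DOCKER_BUILDKIT=$DOCKER_BUILDKIT COMPOSE_DOCKER_CLI_BUILD=$COMPOSE_DOCKER_CLI_BUILD"
```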


# Docker Contd.

## Share data between docker container and host

## Bind mount volumes

Docker containers are ephemeral: any data created inside a container lives only in that container's writable layer and is lost when the container is removed.

__Scenario__:

Let's say we want to run an nginx container and keep a permanent copy of the log files generated during its run for later analysis. Nginx writes its log files to /var/log/nginx by default, which is not accessible from the host system.

__Step 1: Bindmount Volume__:

`docker run --name=nginx -d -v ~/nginx_logs:/var/log/nginx -p 5000:80 nginx`

* `--name=nginx` names the container so that we can refer to it easily
* `-d` runs the container in detached state, i.e. in the background, so that the terminal we ran `docker run` from stays available
* `-v ~/nginx_logs:/var/log/nginx` bind-mounts a volume that links the `/var/log/nginx` directory inside the container to the `~/nginx_logs` directory on the host. Docker uses __`:`__ to separate the host path from the container path and __the host path always comes first__.
* `-p 5000:80` __port forwarding__. This flag maps the container's port 80 to port 5000 of the host machine.
* `nginx` name of the image

> The -v flag is very flexible. It can bindmount or name a volume with just a slight adjustment in syntax. If the first argument begins with a / or ~/, you’re creating a bindmount. Remove that, and you’re naming the volume.
For more details: https://www.digitalocean.com/community/tutorials/how-to-share-data-between-docker-containers

__Step 2: Access Data on Host__:
Just go to the directory `~/nginx_logs` and see the log file.
> If you make any changes to the `~/nginx_logs` folder, you’ll be able to see them from inside the Docker container in real time as well.

> Multiple containers can share the same bind mount.

## Named Volumes

- They are the more recent way of creating volumes and they exist outside the container lifecycle.
- Named volumes support more features than bind mounts, such as remote cloud storage

```bash
docker volume create my_volume
docker run -v my_volume:/directory/in/container ...
```

## Docker Memory

Docker for Windows and Mac comes with a GUI where we can increase the maximum memory allocated to Docker. This is because the Docker engine runs inside a VM, and that VM is given a default amount of memory.

On Linux, Docker runs directly on the host without a VM and can use the whole machine's memory. So if you have to assign >5GB (let's say) to your container, you don't need to do anything on Linux. Instead, on Linux you can limit how much memory a container may use with the `--memory` (`-m`) flag, e.g. `docker run -m 512m <image>`.

# Docker Syntax

EXAMPLE 1:
```dockerfile
FROM node:10.9.0

COPY . .

RUN npm install

EXPOSE 8080

CMD npm start

# EXPLANATION:

# FROM: it's better to pin a version than to use latest. The latest tag keeps moving, so a new pull can silently replace the working image, and if the new image has a problem it's tricky to roll back. Versioning gives more control.

# COPY copies the current directory (build context) to the working directory in the container. Here the working dir of the container is inherited from the node image.
# ADD is more advanced/powerful than COPY, e.g. it automatically extracts local archives and supports URLs. Use COPY by default to avoid surprises unless you specifically need ADD.

# RUN runs the shell command in the current working directory at build time.

# EXPOSE tells which port the container listens on. It's merely documentation, as it does not actually publish the port on the host machine. To publish ports so clients can connect to them, use the `-p` flag on the command line.

# CMD tells what command to run when someone starts the container. The above CMD tells docker to start the node server for our application.
```

---

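__WORKDIR__: By default the workdir is set to `/`. A relative path is resolved against the previous WORKDIR, so `CMD pwd` in the example below prints `/var/log/nginx`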
```
FROM ubuntu:latest
WORKDIR var
WORKDIR log/nginx
CMD pwd
```
---
__ADD__ VS __COPY__:

- ADD is more advanced than COPY in two ways - 1. It extracts tarball 2. Can use remote URL for download
- Docker recommends use of COPY unless it's a remote file.
---
__RUN__ has 2 forms:

`RUN <command> (known as the shell form)`

`RUN ["executable", "parameter 1", "parameter 2"] (known as the exec form)`

e.g of shell form

```RUN echo `uname -rv` > $HOME/kernel-info```

**A very important point about _RUN_: Docker keys the build cache on the instruction string of RUN, not on what the command actually produces when it runs.** For example, consider the instructions below

```
FROM ubuntu:18.04
RUN apt-get update
```
Docker will cache all layers. If the Dockerfile is later extended, however
```
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y curl
```
Docker sees an unchanged `RUN apt-get update` line and reuses its cached layer, so the package index is not refreshed and the install can pull outdated packages. Combining both steps into one instruction (`RUN apt-get update && apt-get install -y curl`) avoids this. The cache for the RUN instructions can also be bypassed by using the `--no-cache` flag

`docker build -t <image_name> -f <dockerfile_path> <build_context_path>`

> If you don't provide the -t flag, the image is left untagged and can only be referenced by its ID; if you don't provide the -f flag, docker looks for a file named `Dockerfile` in the root of the build context; build_context_path is usually "." but if your context is somewhere else then give its path relative to the current dir

The docker build command builds Docker images from a Dockerfile and a “context”. A build’s context is the set of files located in the specified PATH or URL. The build process can refer to any of the files in the context. For example, your build can use a COPY instruction to reference a file in the context.

# Bash

1. find [where_to_find] -name [what_to_find]

e.g. `find / -name "*.whl"` (quote the pattern so the shell does not expand `*` before `find` sees it)
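
A small sketch of the same search on a throwaway directory, so it is safe to run anywhere. The directory layout and file names are made up for illustration.

```shell
# Build a throwaway directory tree so the example is safe to run anywhere.
search_root=$(mktemp -d)
mkdir -p "$search_root/dist"
touch "$search_root/dist/pkg-1.0-py3-none-any.whl" "$search_root/notes.txt"

# Quoting the pattern makes find, not the shell, expand the wildcard.
found=$(find "$search_root" -name "*.whl")
echo "$found"
```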

---

2. To check listening ports, e.g. when an AWS instance is launched and we want to check from the terminal

`netstat -ltpn`

---

3. CHECKING MEMORY and DISK USAGE

`free -m` - available and used RAM and swap

`du -sh *` - disk usage for all files and folders (*) in current directory

`df -h` - disk space

## GetIntoDevOps Notes

### CI

- Emphasises automated testing to ensure that new code changes work as intended and do not break anything (fast feedback)
- Pipeline automation servers (like Jenkins) are used to implement automatic testing

### CD (Continuous Delivery)

- While CI is the act of merging code as fast as possible, CD is the act of shipping changes to production frequently, in small increments
- In practice, the code in main branch should be deployable to production at all times

### CD (Continuous Deployment)

- While Delivery makes sure everything in the main branch should be in deployable state, the actual automation of deployment without human intervention is part of this stage.

### A/B Testing

- Introducing two changes in the application and measuring which variant works better


# Journey Questions

1. Basic pipeline running for various repos
2. Use docker
3. Implement bazel caching
4. Implement docker caching
5. 

## VSCode

Code quality tools currently using:
1. Flake8
2. mypy
3. black
4. isort

For python dependency and management:

`pipenv`

All the above tools have been installed globally.

My overall json settings as of now
```json
{
    "python.linting.pylintEnabled": false,
    "python.linting.mypyEnabled": true,
    "python.linting.enabled": true,
    "python.linting.flake8Enabled": true,
    "editor.formatOnSave": true,
    "python.formatting.provider": "black",
    "python.formatting.blackArgs": [
        "--line-length=120"
    ],
    "python.linting.flake8Args": [
        "--max-line-length=120",
        "--ignore=E402"
    ],
    "[python]": {
        "editor.codeActionsOnSave": {
            "source.organizeImports": true
        }
    },
    "window.zoomLevel": 2,
    "files.insertFinalNewline": true
}
```

If I have to deviate from my global settings for another project, I go into that project's virtual environment and use its Python interpreter (Ctrl+Shift+P, then select the Python interpreter corresponding to that env). The VSCode settings.json is then different for that environment.

## How to increase EBS Volume size without downtime

__Phase 1__
Increasing the size from console

1. Select Volume that you want to scale
2. Create snapshot (Optional)
3. Click on Modify Volume
4. Give the size
> let's say it was 30GB and you want to make it 50GB, then give 50GB in the entry field
5. Click on Modify

__Phase 2__
Now we extend the partition

1. Type `lsblk` to list the block devices

A Nitro based instance will show something like below
```bash
    $ lsblk

    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    nvme0n1 259:0 0 600G 0 disk
    └─nvme0n1p1 259:1 0 400G 0 part /
```

A T2 based instance will show something like below
```bash
    $ lsblk

    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    xvda 202:0 0 600G 0 disk
    └─xvda1 202:1 0 400G 0 part /
```

2.1 __Extending Partition on Nitro based instance__

The root volume, /dev/nvme0n1, has a partition, /dev/nvme0n1p1. 
Notice that the size of the root volume reflects the new size but the size of the partition reflects the original size which must be extended before you can extend the file system.

To extend the partition on the root volume, use the following `growpart` command
```bash
$ sudo growpart /dev/nvme0n1 1
```

> Note that /dev/nvme0n1 is the name of the root volume and 1 is the partition number

2.2 __Extending Partition on T2 based instance__

The root volume, /dev/xvda, has a partition, /dev/xvda1. 
Notice that the size of the root volume reflects the new size but the size of the partition reflects the original size which must be extended before you can extend the file system.

To extend the partition on the root volume, use the following `growpart` command
```bash
$ sudo growpart /dev/xvda 1
```

3. __Extend the file system itself__

Based on the type of instance use either
```bash
sudo resize2fs /dev/nvme0n1p1
```
or 
```bash
sudo resize2fs /dev/xvda1
```
> The last argument is the partition's device path, i.e. `/dev/` followed by the partition name that lsblk shows

You can now verify the size by `df -h`
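
One caveat worth sketching: `resize2fs` only handles ext2/ext3/ext4 file systems, while an XFS root would need `xfs_growfs` (which takes the mount point, e.g. `sudo xfs_growfs -d /`). The helper below is a hypothetical convenience, not part of any AWS tooling; it assumes GNU `df -T` is available to report the file system type.

```shell
# Pick the grow command for a file system type (as reported by `df -T` or `lsblk -f`).
# resize_command_for is a hypothetical helper name used only for this sketch.
resize_command_for() {
  case "$1" in
    ext2|ext3|ext4) echo "resize2fs" ;;
    xfs)            echo "xfs_growfs" ;;
    *)              echo "unsupported" ;;
  esac
}

# Second column of the second line of `df -T /` is the root file system type.
root_fstype=$(df -T / 2>/dev/null | awk 'NR==2 {print $2}')
echo "root filesystem: ${root_fstype:-unknown} -> $(resize_command_for "$root_fstype")"
```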