### Docker for Application

#### Docker volumes
* Docker offers three types of mounts for container storage
  + tmpfs mount
    + used to store sensitive data
    + temporary storage
      + data stored in memory
  + named or anonymous volume
    + managed by Docker through the docker CLI
    + `docker volume create [name of volume]`
    + `docker run --volume code-volume:/app`
      + if the volume exists, it is mounted; otherwise, a new volume is created
    + docker CLI command for listing named volumes
      + `docker volume ls`
    + advantages
      + volume is a managed object
      + isolated from other host activity
      + easy to identify and backup
      + better performance when using Docker Desktop
        + the volume is stored in the Linux virtual machine rather than on the local system
    + disadvantage
      + owned by the root user (Docker must be run as root), which is not good practice and should be avoided
  + bind mount: maps a directory inside the container to
    + an arbitrary directory on the host
    + changes are reflected on the host
    + `docker run --volume /path/on/host:/path/in/container` (the host path must be an absolute path)
    + does not require the root user
    + requires a container to be run (a bind mount is not a standalone object managed by Docker)
    + user and group ID mismatches must be considered when using bind mounts
      + the user on the local host has a specific user ID and group ID (e.g. uid 1000, gid 1000)
      + by default, the user inside the container is root (uid 0, gid 0)
        + files or folders created inside the container cannot be modified or removed by the regular host user
          + these files/directories belong to root
      + solution 1: create a user and group with the matching uid and gid in the Dockerfile
        + `RUN groupadd -r --gid 1000 user \
          && useradd -r --uid 1000 -g user user`
        + then run the container as that user
        `docker run --volume $(pwd):/src --user user myapp touch /src/created_in_container`
      + solution 2
        + similar to solution 1, but the user and group IDs are passed in with ARG instead of being hard-coded in the RUN shell command

          ```dockerfile
          FROM debian

          ARG UID=1000
          ARG GID=1000
          RUN groupadd -r --gid $GID user \
           && useradd -r --uid $UID -g user user
          ```

          In a terminal, run
          `docker build --build-arg UID=1001 --build-arg GID=1001 -t myapp .`
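
The build arguments shown above can be filled in from the current host user instead of typing the numbers by hand. The following is a minimal sketch, assuming a POSIX shell with the `id` utility; the variable names `host_uid`, `host_gid`, and `build_cmd` are made up for illustration, and the script only prints the command so it can be reviewed before it is run.

```shell
# Read the numeric user ID and group ID of the current host user.
host_uid=$(id -u)
host_gid=$(id -g)

# Assemble the build command; it is printed here, not executed.
build_cmd="docker build --build-arg UID=${host_uid} --build-arg GID=${host_gid} -t myapp ."
echo "$build_cmd"
```

Files created by a container started from that image with `--user user` should then carry the same numeric owner as the host user on the bind-mounted directory.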
       

#### A use case for multi-stage builds
* with compiled programming languages
  + increased complexity
    + coding with hot reload through a bind mount is not as trivial as with interpreted languages
  + larger image size
    + development tools are captured inside the container image, but they are not needed once the code is compiled into an executable
* solution: builder pattern
  + split the build steps from the run steps, with a separate Dockerfile for each task
  + a bind mount is needed to make the files/directories from the build step available when building the run image
* multi-stage Dockerfile (put everything in one Dockerfile, with each stage introduced by FROM)

    ```dockerfile
      FROM node:14 AS builder
      WORKDIR /deps
      COPY . .
      RUN npm install
      
      FROM gcr.io/distroless/nodejs
      COPY --from=builder /deps /app
      WORKDIR /app
      CMD ["server.js"]
      
    ```  
* with a multi-stage Dockerfile, you can tell docker build which stage to stop at; stages after the specified stage are ignored
  + the target stage and its predecessors are included in the build
  `docker build -t app-builder --target builder .`
  + smaller image sizes are attained by selective inclusion of content
* constructing a multi-stage Dockerfile
```dockerfile
# base stage: shared Go toolchain
FROM golang:1.16 AS base
# lint stage: runs the linter against /app
FROM base AS lint
COPY golangci-lint /go/bin/
WORKDIR /app
CMD ["golangci-lint", "run"]
# build stage: compiles the binary
FROM base AS build
WORKDIR /app
# the pattern go.??? matches go.mod and go.sum
COPY go.??? ./
RUN go mod download
COPY *.go ./
RUN go build -o mini .
# execution stage: ships only the compiled binary
FROM alpine:3
COPY --from=build /app/mini /
ENTRYPOINT ["./mini"]
```

* An image used only for linting can be built using the --target option
  + `docker build -t mini-lint:1.0 --target lint .`
  + the built image can then be used to lint the code through a bind mount
  `docker run -it --rm -v $(pwd):/app mini-lint:1.0`
    + CMD is executed to lint the code in the host's current folder, which is mapped to /app inside the container
* one problem with the legacy Docker build process is that it is linear
  + even though there is no dependency between lint and build, the build stage is processed after the lint stage
  + BuildKit can work out the dependencies, e.g. that build and lint are both based on base and do not depend on each other
    + it processes the Dockerfile instructions and constructs a directed acyclic graph of dependencies
      + steps that do not depend on each other are built in parallel
    + it provides an optional, extended Dockerfile instruction set for more advanced build features
    + BuildKit was not the default build engine in older Docker Engine releases (it is the default since Docker Engine 23.0)
    + to enable BuildKit, you can
      + set it in the daemon.json file

        ```json
        {
          <snip>
          "features": {"buildkit": true},
          <snip>
        }
        ```

      + for Docker Desktop, set this in the settings
      + set the DOCKER_BUILDKIT environment variable to 1
        `export DOCKER_BUILDKIT=1`
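
The dependency analysis described above can be illustrated with a toy model. This is only a sketch in plain shell, not BuildKit's implementation: the hypothetical helper `parent_of` hard-codes the stage relationships of the example Dockerfile (the unnamed final stage is called `run` here purely for illustration), and `stages_for` walks from a target stage back to its root to list the stages that target actually needs.

```shell
# Which stage does a given stage build on (FROM or COPY --from)?
# The relationships are hard-coded from the example Dockerfile.
parent_of() {
  case "$1" in
    lint|build) echo "base" ;;
    run)        echo "build" ;;
    *)          echo "" ;;
  esac
}

# List the stages a target depends on, root first.
stages_for() {
  stage=$1
  chain=""
  while [ -n "$stage" ]; do
    chain="$stage${chain:+ $chain}"
    stage=$(parent_of "$stage")
  done
  echo "$chain"
}

stages_for lint   # prints: base lint
stages_for run    # prints: base build run
```

In this model the `lint` stage never appears in the chain for `run`, which is why a dependency-aware builder can skip it or build it in parallel with `build`.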
      

#### Optimization of Docker images
* anatomy of an image (as shown by `docker inspect`)
  + image configuration object (metadata)
    + the command executed by docker run
    + ports to expose
    + working directory (`WorkingDir`)
  + RootFS (file system definition for derived containers)
    + content layers that make up the filesystem (`Layers`, an array of SHA-256 digests)
  + the majority of Dockerfile instructions add metadata to the image
    + which user to run a container as
    + which command or program to execute
  + a small portion of the instructions create content for the file system in the form of layers
    + COPY instruction (recommended where possible)
    + ADD instruction (can also copy remote content)
    + RUN (executes commands to generate additional image content)
      + add utilities
      + install the app's dependencies
      + create a user
    + whenever docker build executes a COPY, ADD, or RUN instruction, a new content layer is added to the image
* when a container is created, Docker assembles the content by taking a union of the layers in turn, presenting a single homogeneous set of files and directories
  + the image layers are read-only
  + this makes it possible to build images from other images
  + to allow writes, Docker adds a final layer on top of the image layers
    + a unique, temporary, writable layer for every container instance
      + it is not part of the image and is removed when the container is removed
      + if the container needs to create a new directory, it is created in this layer
      + to alter content from a lower layer, the content is first copied to the final layer (copy-on-write)
      + to delete content, it is hidden from the union by a technique called whiteout
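
The union and whiteout behaviour described above can be mimicked with plain directories. This is a toy model only, assuming a POSIX shell with `mktemp`; real storage drivers such as overlay2 work differently, and the layer names and the `lookup` helper are made up for illustration. The `.wh.` prefix mirrors the whiteout naming used in OCI image layer archives.

```shell
# Directories stand in for layers: two image layers plus the container's writable layer.
work=$(mktemp -d)
mkdir -p "$work/layer1" "$work/layer2" "$work/container"

echo "port=80"   > "$work/layer1/app.conf"   # image layer 1
echo "curl-bin"  > "$work/layer1/curl"       # image layer 1
echo "port=8080" > "$work/layer2/app.conf"   # image layer 2 shadows layer 1
: > "$work/container/.wh.curl"               # whiteout marker in the writable layer

# Resolve a file name from the top layer down, honouring whiteout markers.
lookup() {
  for layer in container layer2 layer1; do
    if [ -e "$work/$layer/.wh.$1" ]; then echo "(deleted)"; return 0; fi
    if [ -e "$work/$layer/$1" ]; then cat "$work/$layer/$1"; return 0; fi
  done
  echo "(not found)"
}

lookup app.conf     # prints: port=8080
lookup curl         # prints: (deleted)
ls "$work/layer1"   # app.conf and curl are both still stored in layer 1
```

The last command shows the point that matters for image size: the whiteout hides `curl` from the merged view, but the file still occupies space in the lower layer.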

#### Consolidate RUN commands to reduce Docker layers
* to install Node.js and the Yarn package manager, we first need to install some utilities
  + ```dockerfile
    RUN apt-get update
    # curl: retrieves PGP keys; ca-certificates: lets curl work securely;
    # gnupg: adds the retrieved keys to the keyring
    RUN apt-get install -y \
         curl \
         ca-certificates \
         gnupg
    ```
  + this content is temporary; we don't need it after the installation
    + it makes the image larger than it needs to be
    + we want to build lean, secure, and efficient Docker images
    + we need a way to remove it once it has served its purpose
  + how can we remove it?
    + solution 1: purge the utilities in a later RUN instruction
    + ```dockerfile
      RUN apt-get update
      # curl, ca-certificates, gnupg: utilities for retrieving and storing PGP keys
      RUN apt-get install -y \
           curl \
           ca-certificates \
           gnupg

      RUN apt-get update

      RUN apt-get install -y \
           nodejs \
           yarn

      RUN apt-get purge -y curl ca-certificates gnupg
      ```
    + with this approach the temporary content is still there; the image size is not reduced and may even grow
      + the content is not deleted from the earlier layer; it is only hidden from the union of layers
      + extra content (the whiteout) is added to hide it from the union
    + what we can do instead is consolidate all the RUN commands into one layer
    + ```dockerfile
      # curl, ca-certificates, gnupg: utilities for retrieving and storing PGP keys
      RUN apt-get update && \
          apt-get install -y \
           curl \
           ca-certificates \
           gnupg && \
          apt-get update && \
          apt-get install -y \
           nodejs \
           yarn && \
          apt-get purge -y curl ca-certificates gnupg
      ```
    + the disadvantage is that if any line of this instruction changes, the entire block has to be re-executed
      + the build cache cannot be used for the unchanged parts
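
The size argument in this section can be checked with a toy comparison in which directories stand in for image layers. This is a sketch under stated assumptions, not a measurement of real images: the 1000-byte file is an arbitrary placeholder for the installed utilities, and `stored_bytes` is a hypothetical helper.

```shell
work=$(mktemp -d)

# Variant A: utilities installed in one RUN, purged in a later RUN.
mkdir -p "$work/a/layer1" "$work/a/layer2"
head -c 1000 /dev/zero > "$work/a/layer1/curl"   # layer 1 stores the utility
: > "$work/a/layer2/.wh.curl"                    # layer 2 only records the removal

# Variant B: install and purge inside a single RUN; the layer keeps the end state only.
mkdir -p "$work/b/layer1"

# Sum the bytes of all files stored under a directory tree.
stored_bytes() { find "$1" -type f -exec cat {} + | wc -c | tr -d ' '; }

size_a=$(stored_bytes "$work/a")
size_b=$(stored_bytes "$work/b")
echo "separate RUN instructions: $size_a bytes"
echo "single RUN instruction:    $size_b bytes"
rm -rf "$work"
```

The first variant still stores all 1000 bytes even though the merged view no longer shows the file, while the second variant stores nothing for it.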
      

#### Using the build cache
* Docker uses a local cache of image build steps
  + careful placement of Dockerfile instructions can maximize cache hits
  + docker build process:
    + each Dockerfile instruction processed during a build results in an intermediate image that becomes part of the build cache
    + these images are created by committing containers created from the image associated with the preceding instruction
    + images reference their parent image, forming an implicit chain of images that represents the sequence of instructions
  + docker build caches each intermediate layer, and later builds check this cache
    + if a layer is found in the cache, the cached intermediate image is used directly and the build continues from there
    + this reduces the build time
* the following changes invalidate the cache
  + instruction change
    + adding, removing, or altering an instruction
  + checksum change
    + the content that COPY or ADD takes from the build context has changed
  + command output is not checked
    + the consequences of executing a command are not compared, only the instruction itself
  + original
  ```dockerfile
  WORKDIR /app
  COPY . .
  RUN yarn install
  ```
  + change to
  ```dockerfile
  WORKDIR /app
  COPY package.json yarn.lock ./
  RUN yarn install
  COPY spec src ./
  ```
  + with this change, package.json and yarn.lock are not copied again and the dependencies are not reinstalled every time the source code changes
    + the layers before `COPY spec src ./` are cached on the local host and are reused directly by the build
  + tips:
    + analyze the dependencies between Dockerfile instructions to determine ordering constraints
    + order Dockerfile instructions according to their frequency of change
      + less frequent first, more frequent last
    + where it is beneficial, split the COPY instructions that copy content from the build context
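
The effect of instruction ordering on cache hits can be sketched with a toy cache-key model. This is an illustration only, not Docker's actual algorithm: each step's key is assumed to be a digest of the parent key, the instruction text, and a checksum of the copied files. The helpers `digest`, `step_key`, and `build_keys` are hypothetical, `sha256sum` is assumed to be installed (on macOS `shasum -a 256` is the usual substitute), and the last step is simplified to `COPY src ./`.

```shell
ctx=$(mktemp -d)
mkdir -p "$ctx/src"
echo '{"name":"demo"}' > "$ctx/package.json"
echo 'console.log(1)'  > "$ctx/src/index.js"

# Digest of whatever arrives on stdin.
digest() { sha256sum | cut -d' ' -f1; }

# step_key PARENT_KEY INSTRUCTION [PATH...]
# Combines the parent key, the instruction text, and the content of copied files.
step_key() {
  parent=$1; instruction=$2; shift 2
  content=""
  if [ "$#" -gt 0 ]; then
    content=$(cd "$ctx" && find "$@" -type f | sort | xargs cat | digest)
  fi
  printf '%s|%s|%s' "$parent" "$instruction" "$content" | digest
}

# Keys for the reordered Dockerfile: dependency manifests first, source code last.
build_keys() {
  k1=$(step_key "base" "COPY package.json ./" package.json)
  k2=$(step_key "$k1" "RUN yarn install")
  k3=$(step_key "$k2" "COPY src ./" src)
  echo "$k1 $k2 $k3"
}

before=$(build_keys)
echo 'console.log(2)' > "$ctx/src/index.js"   # change the source code only
after=$(build_keys)
echo "before: $before"
echo "after:  $after"
rm -rf "$ctx"
```

Only the third key differs between the two runs, so in this model the `yarn install` step would be served from the cache after a source-only change.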
  

#### Utilizing multi-stage Dockerfiles to optimize image size
* By splitting the consolidated command line into separate RUN instructions, we create multiple layers in the image
  + this increases the size of the image
  + but it speeds up the build, since most instructions correspond to cached layers
  + if this is not the final production image, the image size is not a concern
    + we sacrifice size for speed and efficiency
* we can use multi-stage Dockerfiles to control the final image size
  + we are free to use many layers in the intermediate (earlier) stages
    + this takes full advantage of the build cache
  + the temporary content is simply left behind in an earlier stage
    + there is no need to remove it, since only the content copied into the final stage ends up in the final image
    
    

#### How Docker handles different environments
* application configuration
  + everything that is likely to vary between deploys (staging, production, developer environments, etc.)
  + principle of separating code and config
    + don't define configuration as constants in the application's source code
    + store config in environment variables
      + easy to manage many instances running in different environments
      + no leakage of sensitive information hard-coded in software applications
      + straightforward onboarding of new environments to host software apps
      + no need to re-test software apps due to changes to the configuration
      + config defined in environment variables is agnostic to language and OS
* define environment variables in a Dockerfile with ENV
  + ```dockerfile
    # not recommended: legacy form without "="
    ENV REDIS_HOST "redis_server"
    # recommended
    ENV REDIS_HOST="redis_server" REDIS_PORT=6379
    ENV REDIS_HOST="redis_server" \
        REDIS_PORT=6379
    ```
* define environment variables in a .env file and pass it to the container
`docker run --rm --env-file file/path/to/redis.env`

* don't use environment variables to store sensitive information

* combine ARG and ENV in a Dockerfile
  + ```dockerfile
    ARG NODE_ENV
    ENV NODE_ENV="${NODE_ENV:-production}"
    ```
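
The `:-` notation in the ENV line above is borrowed from shell parameter expansion, so its basic behaviour can be tried in a plain shell. A minimal sketch, assuming a POSIX shell; the variable names `default_case` and `override_case` are made up for illustration.

```shell
# Variable not set: the fallback after ":-" is used.
unset NODE_ENV
default_case="${NODE_ENV:-production}"

# Variable set: its own value wins over the fallback.
NODE_ENV=development
override_case="${NODE_ENV:-production}"

echo "$default_case"    # prints: production
echo "$override_case"   # prints: development
```

In the Dockerfile, the value would be supplied at build time, for example with `docker build --build-arg NODE_ENV=development .`, and fall back to `production` when no build argument is given.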

#### Application logging configuration in Docker
* logs are designed to report on events during execution
  + programmers add logging statements to their source code for reporting
  + log messages are output during program execution and help with debugging
* Docker captures and stores output written to the STDOUT and STDERR streams
  + therefore, write logs to standard output
  + if an app such as nginx writes logs to its own log files, we use symbolic links to redirect them to stdout and stderr
    + in the Dockerfile, we run
    ```dockerfile
    RUN ln -sf /dev/stdout /var/log/nginx/access.log && \
        ln -sf /dev/stderr /var/log/nginx/error.log
    ```
* logging drivers (https://git.io/JOPzr)
  + json-file is Docker's default logging driver
    + logs are stored locally in JSON format by the Docker daemon
  + local
    + a flexible and more performant file-based logging solution
  + journald
    + logs are sent to the journald service running on the Docker host
  + the logging driver is configured in the daemon.json file

    ```json
    {
      "log-driver": "local",
      "log-opts": {
        "max-size": "10m",
        "max-file": "6"
      }
    }
    ```
  + the system configuration can also be overridden for individual containers
    `docker run -it --name todo --log-driver local --log-opt max-file=3 todo`
  + to find out which logging driver a specific container uses, we can run
    `docker inspect --format '{{.HostConfig.LogConfig.Type}}' todo`
* to see logs, use
  `docker logs container_id/name`
  + other options to customize the output
    + --details
    + --follow (see logs in real time)
    + --tail
    + --since (e.g. `--since 5m`)
    + --until
    + --timestamps
* Docker config example
  + check the system-wide logging driver
  `docker info -f '{{.LoggingDriver}}'`
  + change it to the local driver
    + find daemon.json, or open it from Docker Desktop, and add a new line `"log-driver": "local"`
* when a container is removed, its logs are removed as well, unless the journald logging driver is used
  `sudo journalctl -f CONTAINER_NAME=todo`
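
The symbolic-link technique used for the nginx log files in this section can be tried without Docker. A minimal sketch, assuming a Linux-like system where `/dev/stdout` exists; the temporary directory, the sample log line, and the variable names are made up for illustration.

```shell
# Create a "log file" that is really a symbolic link to standard output.
logdir=$(mktemp -d)
ln -sf /dev/stdout "$logdir/access.log"

# Anything appended to the file now lands on the writing process's stdout,
# which is the stream Docker would capture for a container.
captured=$(echo "GET / 200" >> "$logdir/access.log")
echo "$captured"
rm -rf "$logdir"
```

The command substitution plays the role of the log collector here: the line written to `access.log` is picked up from stdout rather than being stored on disk.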