# `learning-docker` using Docker Desktop in Windows

## Check if `docker` is installed properly

In [None]:
!docker info

## The Docker flow

In [None]:
!docker images

An image can be referred to by using `<REPOSITORY:TAG>` or `<IMAGE ID>`.

![](what-is-docker.png)

The following is a useful `bash` export that formats the output from `docker ps` in a more readable manner. Use it as `docker ps --format=$FORMAT`.
```bash

export FORMAT="\nID\t{{.ID}}\nIMAGE\t{{.Image}}\nCOMMAND\t{{.Command}}\nCREATED\t{{.RunningFor}}\nSTATUS\t{{.Status}}\nPORTS\t{{.Ports}}\nNAMES\t{{.Names}}\n"

# this shell will only work on bash
```

A Docker image has just enough of the operating system to do what is needed for code to be run. `docker images` lists the images stored locally. `docker rmi <IMAGE ID>` removes an image.

The `docker run` command takes an image and turns it into a container with a running process in it.

In [None]:
!docker run -ti ubuntu:latest bash

`-ti` stands for terminal interactive. It gives you a full terminal within the container so that you can run the shell and have things like tab completion and formatting work correctly.

In [None]:
!docker ps -l

Run `docker ps` to get a list of running containers and `docker ps -l` to get the most recently created container. A running container's ID and an image ID are different.

Running containers are unique and independent of other containers based on the same image. If you create a container and add a file to it, another running container will not be able to see the new file.

For a web-based application, you can have one container that holds the MongoDB database, another container that holds the React front-end, and the hosting server in the final container. Volumes contain the data for the containers. Container networking allows them to communicate with each other.

To see all containers (including stopped containers), run `docker ps -a`. As mentioned above, `docker ps -l` shows the most recently created container, whether or not it has exited.

The `docker commit` command takes containers and makes images out of them. `docker run` and `docker commit` are complementary to each other.
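
The relationship between images, independent containers, and `docker commit` can be sketched in plain Python. This is only a toy model, not how Docker stores data: the file paths are hypothetical, and `start_container` and `commit` are illustrative helper names. Each container gets a private writable layer on top of a shared image, and committing flattens that view into a new image.

```python
from collections import ChainMap

# Read-only "image": file path -> contents (hypothetical sample files).
image_layer = {"/etc/hostname": "base", "/bin/sh": "<shell binary>"}


def start_container(image):
    """Return a container view: a private writable layer on top of the image."""
    return ChainMap({}, image)


def commit(container):
    """Flatten a container into a new image, in the spirit of `docker commit`."""
    return dict(container)


first = start_container(image_layer)
second = start_container(image_layer)

# ChainMap writes always go to the first mapping, i.e. the private layer.
first["/tmp/new-file"] = "hello"

print("/tmp/new-file" in first)        # True
print("/tmp/new-file" in second)       # False
print("/tmp/new-file" in image_layer)  # False

new_image = commit(first)
print("/tmp/new-file" in new_image)    # True
```

The original image is untouched, the second container never sees the new file, and only the committed image carries it forward.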

The `latest` tag is optional in Docker: when no tag is given, `latest` is assumed. `docker run -ti ubuntu bash` is the same as `docker run -ti ubuntu:latest bash`.

In [None]:
!docker ps -l

In [None]:
!docker commit 11d30f7d8158

We now have a new image. The original is unchanged. We now need to tag the image.

In [None]:
!docker tag dfca65088a629cb623abb71717cee366a52083eab2e0a03a75d4cce09e425b59 my-image

In [None]:
!docker images

We can skip `docker tag` by doing `docker commit stupefied_poitras my-image-2`.

In [None]:
!docker commit stupefied_poitras my-image-2

In [None]:
!docker images

## Running processes in containers

`docker run --rm -d`
- Containers have a main process
- The container stops when the process stops
- Containers have names
- `--rm` removes the container automatically when it exits, for when you want to run something in a container but don't want to keep it afterwards. It saves running `docker rm <container name>` yourself.
- `-d` for detached. It starts the container and leaves it running in the background. Run `docker attach <container name>` to connect to the running container. Press `CTRL+P`, then `CTRL+Q` to detach from a running container while leaving it running.

Example `docker run` commands:
- `docker run --rm -ti ubuntu sleep 5`
- `docker run -ti ubuntu bash -c "sleep 3; echo all done"`
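
When these commands are scripted from the notebook rather than typed, it can help to assemble the flags programmatically. The following is a minimal sketch, assuming a hypothetical helper named `docker_run_args`; it only builds the argument list and does not call Docker.

```python
import shlex


def docker_run_args(image, command, *, remove=False, detach=False,
                    interactive=False, name=None):
    """Build the argument list for a `docker run` invocation."""
    args = ["docker", "run"]
    if remove:
        args.append("--rm")
    if detach:
        args.append("-d")
    if interactive:
        args.append("-ti")
    if name is not None:
        args += ["--name", name]
    args.append(image)
    args += list(command)
    return args


args = docker_run_args("ubuntu", ["bash", "-c", "sleep 3; echo all done"],
                       interactive=True)
# shlex.join quotes the arguments so the line can be pasted into a shell.
print(shlex.join(args))
```

On a machine with Docker installed, the list could be handed to `subprocess.run(args)`; keeping it as a list avoids shell-quoting mistakes.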

`docker exec` starts another process in a container that is already running. It's great for debugging and DB administration, but you can't add ports, volumes, etc.

`docker exec -ti <already running container name> bash`

## Managing containers

`docker logs` (Docker keeps the logs of the container around as long as you keep the container around)
- view the output of containers
- `docker logs <container name>`

In [None]:
!docker run --name example -d ubuntu bash -c "lose /etc/password" # should be "less /etc/password"

In [None]:
!docker logs example

Killing and removing containers
- `docker kill <container name>` (makes it stop)
- `docker rm <container name>` (makes it be gone)

Resource constraints
- Memory limits: `docker run --memory <max. allowed memory> <image name> <command>`
- CPU limits: 
  - `docker run --cpu-shares <weight relative to other containers>`
  - `docker run --cpu-quota <hard limit on CPU time per scheduling period>`
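
The difference between the two CPU flags is easier to see with numbers. The sketch below assumes Docker's default values (a share weight of 1024 and a scheduling period of 100000 microseconds); the container names and the helper functions `cpu_fractions` and `quota_in_cpus` are hypothetical.

```python
def cpu_fractions(shares):
    """Map container name -> fraction of CPU time when all containers are busy."""
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}


def quota_in_cpus(cpu_quota_us, cpu_period_us=100_000):
    """Convert a --cpu-quota value (microseconds per period) into a number of CPUs."""
    return cpu_quota_us / cpu_period_us


# Hypothetical containers; "web" keeps the default weight, "batch" gets half.
print(cpu_fractions({"web": 1024, "batch": 512}))
print(quota_in_cpus(50_000))  # 0.5
```

Shares only matter under contention: an idle neighbour leaves its CPU time available to others. A quota is a ceiling that applies even when the machine is otherwise idle.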

Lessons learned
- Don't let containers fetch their dependencies when they start. Include the dependencies in the image itself.
- Don't leave important things in unnamed stopped containers.

## Exposing ports

Docker is generally used to run web and database servers.

- Programs in containers are isolated from the internet by default
- You can group your containers into private networks
- It's explicitly chosen who can connect to whom

A basic example:

- The server: `docker run --rm -ti -p 45678:45678 -p 45679:45679 --name echo-server ubuntu:14.04 bash`
 - root@hostname: `nc -lp 45678 | nc -lp 45679`
- Client 1 (other terminal): `nc localhost 45678`
 - `let's go!`
- Client 2 (other terminal): `nc localhost 45679` (anything typed into client 1 shows up here)

Note: use `nc <Mac: host.docker.internal, Windows: IP> 45678` if running from within a container.

Docker can also select available host ports dynamically, which helps with scalability. For instance, after running `docker run --rm -ti -p 45678 -p 45679 --name echo-server ubuntu:14.04 bash` without the host port specified, run `docker port echo-server` to see the ports that were selected.
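
The two forms of `-p` used above can be told apart mechanically. This is a simplified sketch with a hypothetical `parse_publish` helper; it deliberately ignores other forms that Docker accepts, such as an IP prefix or a `/udp` suffix.

```python
from typing import NamedTuple, Optional


class PortMapping(NamedTuple):
    host_port: Optional[int]  # None means "let Docker choose a free host port"
    container_port: int


def parse_publish(spec: str) -> PortMapping:
    """Parse the `HOST:CONTAINER` or `CONTAINER` forms of a `-p` argument."""
    parts = spec.split(":")
    if len(parts) == 1:
        return PortMapping(None, int(parts[0]))
    if len(parts) == 2:
        return PortMapping(int(parts[0]), int(parts[1]))
    raise ValueError(f"unsupported publish spec: {spec!r}")


print(parse_publish("45678:45678"))
print(parse_publish("45679"))
```

A mapping whose `host_port` is `None` is the case where `docker port` is needed to find out which host port was actually assigned.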

To see the existing networks: `docker network ls`.

In [1]:
!docker network ls

NETWORK ID          NAME                DRIVER              SCOPE
58208bb80eab        bridge              bridge              local
fd33bc2852fd        host                host                local
fc5d38d93ef8        minikube            bridge              local
3ac6b9286ea3        none                null                local


`bridge` is used by containers that don't specify a preference to be put into any other network. `host` is when you want a container to not have any network isolation at all (this does have some security concerns). `none` is when a container should have no networking.

- To create a new network: `docker network create learning`
- To put machines on the network: `docker run --rm -ti --net learning --name catserver ubuntu:14.04 bash`
 - `ping catserver`
- To put machines on the network: `docker run --rm -ti --net learning --name dogserver ubuntu:14.04 bash`
 - `ping catserver`

__NOTE__: Legacy Docker linking, something like `docker run --link catserver --name dogserver <image>`, only operates in one direction. A major drawback is that environment variables get copied over to the container that executes the link.

## Managing images

`docker images` lists downloaded images. It doesn't list images that are available to download. Images that share underlying data don't repeat the data itself; Docker is space efficient.

In [None]:
!docker ps -l 

In [None]:
!docker commit f233f2ac34d6 my-image-14:1.0.0

In [None]:
!docker images

A common naming convention of tagging images: `registry.example.com:port/organization/image-name:version-tag`
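
To make the convention concrete, here is a simplified sketch that splits a reference into its parts. The `parse_image_ref` helper and the port `5000` in the usage lines are hypothetical, and the rule for spotting a registry host (a dot, a colon, or `localhost` in the first component) is an approximation that leaves out digests and other details.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: Optional[str]
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split an image reference into registry, repository, and tag."""
    last_slash = ref.rfind("/")
    last_colon = ref.rfind(":")
    # A colon only introduces a tag if it comes after the last slash;
    # otherwise it belongs to the registry's port.
    if last_colon > last_slash:
        name, tag = ref[:last_colon], ref[last_colon + 1:]
    else:
        name, tag = ref, "latest"
    first, _, rest = name.partition("/")
    # Treat the first component as a registry host if it looks like one.
    if rest and ("." in first or ":" in first or first == "localhost"):
        return ImageRef(first, rest, tag)
    return ImageRef(None, name, tag)


print(parse_image_ref("my-image-14:1.0.0"))
print(parse_image_ref("registry.example.com:5000/organization/image-name:1.0.0"))
```

The fallback to `latest` mirrors what happens when an image is run without a tag.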

Images are grabbed by using `docker pull`, but this is done automatically by `docker run`. The opposite of `docker pull` is `docker push`. `docker rmi image:tag` removes an image from the system.

## Volumes

Volumes are virtual "disks" to store and share data between containers (`--volumes-from`) and between a container and the host (`-v`). Volumes are not part of images.

Two main varieties:
- __Persistent__: the data is available on the host, and when the container goes away, the data will still be there (container-host)
- __Ephemeral__: exist as long as a container is using them; when no container is using them, they evaporate (container-container)

- Sharing a folder from the host to the container: `docker run -ti -v /Users/aurthurulfeldt/example:/shared-folder ubuntu bash`. Files created in `/shared-folder` within the container appear in `/Users/aurthurulfeldt/example` on the host and remain there after the container exits.
- Sharing data between containers: 
 - `docker run -ti -v /shared-data ubuntu bash`
  - `echo data > /shared-data/data-file`
 - `docker run -ti --volumes-from <container-name> ubuntu bash`

## Registries
- Registries manage and distribute images
- Docker (the company) offers these for free
- One can run their own, as well

In [None]:
!docker search ubuntu

Docker Hub is to container images roughly what `pip` and `git` hosting are to software.

## Building Docker images

A `Dockerfile` is a small program to create an image. It is run with `docker build -t <name-of-result> .`. Each line in a Dockerfile is its own call to `docker run`: it starts from the image produced by the previous line, runs in a new container, and saves the result as a new image. Each step is cached, so put the most volatile steps at the end of the file to avoid re-running the earlier steps too often. Read the Dockerfile reference [here](https://docs.docker.com/engine/reference/builder/).
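
Why the order of steps matters for caching can be modelled in a few lines. This is a toy model under a loud assumption: it keys each step only on the instruction text and everything before it, whereas real Docker also takes file contents into account for `COPY` and `ADD`. The helper names `layer_keys` and `steps_to_rebuild` and the sample instructions are hypothetical.

```python
import hashlib


def layer_keys(instructions):
    """Return one cache key per instruction, chained to all earlier steps."""
    keys = []
    parent = ""
    for instruction in instructions:
        parent = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(parent)
    return keys


def steps_to_rebuild(old, new):
    """List the instructions in `new` whose cache key is not among `old`'s."""
    cached = set(layer_keys(old))
    return [step for step, key in zip(new, layer_keys(new)) if key not in cached]


old = ["FROM busybox", "RUN echo one", "RUN echo two", "CMD echo hello"]
late_change = old[:3] + ["CMD echo goodbye"]
early_change = [old[0], "RUN echo changed"] + old[2:]

print(steps_to_rebuild(old, late_change))   # only the last step
print(steps_to_rebuild(old, early_change))  # the changed step and all after it
```

Changing an early step invalidates every later step, even the ones whose text did not change, which is the reason for putting volatile steps last.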

In [None]:
%%file Dockerfile

FROM busybox
RUN echo "building simple docker image."
CMD echo "hello, world."

In [None]:
!docker build -t hello .

In [None]:
!docker run --rm hello

In [None]:
%%file Dockerfile

FROM debian:sid
RUN apt-get -y update 
RUN apt-get install -y nano
CMD ["/bin/nano", "/tmp/notes"]

In [None]:
%%file Dockerfile

FROM example/nanoer
ADD notes.txt /notes.txt
CMD ["nano", "/notes.txt"]

An example of a multi-stage (multi-project) Dockerfile:

In [None]:
%%file Dockerfile

FROM ubuntu:16.04 as builder
RUN apt-get update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size
    
FROM alpine
COPY --from=builder /google-size /google-size
ENTRYPOINT echo google is this big;cat google-size

*.dockerignore* files work similarly to *.gitignore* files: they list paths to leave out of the build.

An example *.dockerignore* file
```
node_modules
npm-debug.log
```
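
The effect of those two patterns can be sketched as a filter over a list of paths. This is a simplification: it uses Python's `fnmatch`, whose wildcard rules differ from the matcher Docker uses, and the file names and the helpers `is_ignored` and `build_context` are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_ignored(path, patterns):
    """True if `path` or any of its parent directories matches a pattern."""
    parts = PurePosixPath(path).parts
    # Check "a", "a/b", "a/b/c" ... so ignoring a directory ignores its contents.
    prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return any(fnmatch(prefix, pattern) for prefix in prefixes for pattern in patterns)


def build_context(paths, patterns):
    """Keep only the paths that are not excluded by the ignore patterns."""
    return [p for p in paths if not is_ignored(p, patterns)]


patterns = ["node_modules", "npm-debug.log"]
files = ["package.json", "index.js", "node_modules/express/index.js", "npm-debug.log"]
print(build_context(files, patterns))  # ['package.json', 'index.js']
```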

An additional *Dockerfile* example
```
FROM node
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 4000
CMD [ "npm", "start"]
```

### Under the hood

A kernel runs directly on the hardware and does the following jobs:
- It receives messages from hardware (I/O -- new CDs)
- Starts and schedules programs
- Controls and organizes storage
- Passes messages between programs
- Allocates resources, memory, CPU, network, etc.

Docker creates containers by configuring the kernel:
- Docker is a program, written in Go, that manages kernel features
- It uses cgroups to contain processes, namespaces to contain networks, and copy-on-write filesystems to build images
- Everyone was doing what Docker does already -- Docker makes scripting distributed systems easy

#### Controlling docker through a socket

`docker run -ti --rm -v /var/run/docker.sock:/var/run/docker.sock docker sh`

Starting another container from a client within a container: `docker run -ti --rm ubuntu bash`. This is a client within a Docker container controlling a server that's outside that container.

### Networking in brief
- Ethernet: moves "frames" on a wire (or Wi-Fi)
- IP Layer: moves packets on a local network
- Routing: forwards packets between networks
- Ports: address particular programs on a computer

#### Bridges
- Docker uses bridges to create virtual networks in your computer
- They function like software switches
- They control the ethernet layer

`docker run -ti --rm --net=host ubuntu:16.04 bash` where `--net=host` gives it full access to the host's networking stack. 

Within the container, run `apt-get update && apt-get install bridge-utils`, then `brctl show` to list the bridges. After running `docker network create my-new-network` on the host, `brctl show` lists an additional bridge.

#### Routing
- Creates firewall rules to move packets between networks
- NAT (network address translation)
- Change the source address on the way out
- Change the destination address on the way back in
- `sudo iptables -n -L -t nat`

`docker run -ti --rm --net=host --privileged=true ubuntu bash`

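Within the container, run `apt-get update && apt-get install iptables`, then `iptables -n -L -t nat` to list the NAT rules. After starting `docker run -ti --rm -p 8080:8080 ubuntu bash` from the host, running `iptables -n -L -t nat` again shows a new rule for port 8080.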

This shows that exposing a port is really port forwarding.
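
The forwarding idea can be modelled with a small lookup table. This is a conceptual sketch, not how iptables is implemented: every address is a hypothetical documentation-range value, and `inbound` and `outbound` are illustrative names for the two rewrites that a published port needs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: tuple  # (ip, port)
    dst: tuple  # (ip, port)


# All addresses below are hypothetical.
HOST_IP = "192.0.2.10"
CONTAINER_IP = "172.17.0.2"
# Published with `-p 8080:8080`: host port -> (container ip, container port).
FORWARDS = {8080: (CONTAINER_IP, 8080)}


def inbound(packet):
    """Rewrite the destination of a packet arriving at a published host port."""
    target = FORWARDS.get(packet.dst[1]) if packet.dst[0] == HOST_IP else None
    return replace(packet, dst=target) if target else packet


def outbound(packet):
    """Rewrite the source of a reply leaving the container for the outside."""
    for host_port, target in FORWARDS.items():
        if packet.src == target:
            return replace(packet, src=(HOST_IP, host_port))
    return packet


request = Packet(src=("198.51.100.7", 50000), dst=(HOST_IP, 8080))
forwarded = inbound(request)
reply = outbound(Packet(src=forwarded.dst, dst=forwarded.src))
print(forwarded)
print(reply)
```

From the client's point of view the conversation is with the host on port 8080; the container's private address never appears in the packets it sees.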

#### Namespaces
- A feature in the Linux kernel that allows complete network isolation between different processes in the system
- They allow processes to be attached to private network segments
- These private networks are bridged into a shared network with the rest of the containers
- Containers have virtual network cards
- Containers get their own copy of the networking stack

#### Processes and cgroups
- One of docker's jobs is to manage processes in containers (keep isolated or communicating)
- Run `docker run -ti --rm --name hello ubuntu bash`, then, from another terminal, find the process ID of the container's main process with `docker inspect --format '{{.State.Pid}}' hello`

In [None]:
!docker run -p 4000:4000 <IMAGE NAME> # Runs the node image example above.

In [None]:
!docker stop <CONTAINER ID> # Stops the running container.

In [None]:
!docker start <CONTAINER ID> # Re-starts the stopped container.

`docker-compose` allows one to manage multiple containers with a single file. It lets docker know which services we want to compose. It's essentially a way to replace the `docker run` commands with a single file.

In [None]:
%%file docker-compose.yml

app:
  container_name: app
  restart: always
  build: .
  ports:
    - "4000:4000"
  links:
    - mongo
mongo:
  container_name: mongo
  image: mongo
  expose:
    - "27017"
  volumes:
    - ./data:/data/db
  ports:
    - "27017:27017"

Run `docker-compose build` for the file above. Then run `docker-compose up -d mongo` since we want the mongo container to run first. To make sure that it's running, run `docker logs <CONTAINER ID>`. Then run `docker-compose up -d app` to run the main app.
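
The start order described here follows from the `links` entries in the compose file: a service should come up after the services it links to. A minimal sketch of that ordering, assuming Python 3.9 or later for `graphlib` and a hypothetical `start_order` helper, looks like this. It models only the ordering idea, not what `docker-compose` does internally.

```python
from graphlib import TopologicalSorter

# Service name -> services it links to (mirrors the compose file above).
links = {
    "app": ["mongo"],
    "mongo": [],
}


def start_order(dependencies):
    """Return service names ordered so every dependency comes first."""
    return list(TopologicalSorter(dependencies).static_order())


print(start_order(links))  # ['mongo', 'app']
```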

Docker can be integrated with a continuous integration framework. One can use something like Travis CI using the following file.

```yaml
sudo: required
services:
  - docker

script:
  - docker build -t <dockeruser/nameofproject> .
  - docker images <dockeruser/nameofproject>

before_deploy:
  - docker login -u <username>  -p <userpassword>

deploy:
  provider: script
  script: docker push <dockeruser/nameofproject>
  on:
    branch: master
```

## Kubernetes

- Containers run programs
- Pods group containers together
- Services make pods available to others
- Labels are used for very advanced service discovery
- `kubectl` makes scripting large operations possible (ex. `kubectl get services -o wide`)
- Very flexible overlay networking
- Runs equally well on your hardware or a cloud provider
- Built-in service discovery
- `EC2 Container Service (ECS)` is another possible orchestration system
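
How labels tie services to pods can be illustrated with a simple equality match. This is a sketch of the idea only: the pod names and labels are hypothetical, `matches` and `select_pods` are illustrative helpers, and real Kubernetes selectors support more than exact equality.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pods(selector, pods):
    """Return names of pods whose labels satisfy the selector."""
    return [name for name, labels in pods.items() if matches(selector, labels)]


# Hypothetical pods and labels.
pods = {
    "web-1": {"app": "web", "env": "prod"},
    "web-2": {"app": "web", "env": "staging"},
    "db-1": {"app": "db", "env": "prod"},
}

print(select_pods({"app": "web"}, pods))                 # ['web-1', 'web-2']
print(select_pods({"app": "web", "env": "prod"}, pods))  # ['web-1']
```

A service defined with the selector `app: web` would route to both web pods without naming either of them, which is what makes label-based discovery flexible.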