ArchitectureGuide

Notes from studying docker, containerd, kubernetes, k3s, k8s, the ELK stack, etc.

Docker

  • Dockerfile

    • What is a dockerfile?

      • A dockerfile is a set of commands to configure and create an image.
        You could pull ready-made images from Docker Hub and automate the remaining setup with shell scripts, but it is cleaner and faster to build images tailored to your server's specific needs.
      • docker build -t my-image-name .

        This command will create an image called my-image-name by reading the Dockerfile in the current path (.).
    • What are the components of a dockerfile?

      • FROM

        This keyword is used to import the base image of the dockerfile.
        FROM python:3.10.13-alpine3.18
        The most common form of FROM references a hub image as repository:tagName.
        Tags are created by the image providers and usually encode OS information and versions.
        ex : python 3.10 installed on the alpine 3.18 OS.
      • WORKDIR

        This keyword specifies the working directory within the image.
        If the directory does not exist, it will be created.
        You can change the directory multiple times to set different locations for instructions like COPY or RUN, as the sketch below shows.
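        A minimal sketch (the paths here are made up for illustration):

        WORKDIR /etc/project          # created if it does not exist
        COPY main.py .                # lands in /etc/project
        WORKDIR /etc/project/logs     # switch again; later instructions use this path
        RUN touch app.log             # runs inside /etc/project/logs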
      • COPY, ENV

        COPY /location-to-copy /in-image-location
        Copies a file from the build context of the dockerfile project into the given location in the image.
        ENV LOG_PATH=/etc/project/log
        Creates an environment variable in the image.
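        ENV values baked into the image can be overridden per container with the -e flag of docker run. A quick check, assuming the image was built as my-image-name (printenv is available in busybox/coreutils based images):

        docker run --rm my-image-name printenv LOG_PATH                        # /etc/project/log
        docker run --rm -e LOG_PATH=/tmp/log my-image-name printenv LOG_PATH   # /tmp/log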
      • RUN vs CMD

        FROM python:3.10.13-alpine3.18
        WORKDIR /etc/project
        COPY main.py .
        
        RUN mkdir external-files \
        && apk --no-cache update
        
        RUN apk --no-cache add curl
        
        CMD curl www.google.com \
        && python main.py
        Both RUN and CMD receive shell commands, but they have a fundamental difference.

        The RUN command executes while the image is being built.
        The CMD command specifies what to run when a container starts.

        If you run docker build . with the dockerfile above, you will get this.
        img3.png
        Note that each RUN instruction creates a separate image layer.
        You should divide them into reasonable layers in order to cache and debug efficiently.

        If you run docker run on the created image, you will get an HTTP response from Google, and then your python script will run.
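        Note that CMD also accepts an exec form written as a JSON array, which skips the shell entirely. The shell form above is kept for the curl example, but for a single long-running process the exec form is the common pattern:

        # exec form: python becomes PID 1 and receives signals like SIGTERM directly
        CMD ["python", "main.py"]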
      • Expose vs -p

        FROM tomcat:10.1.17-jdk21-temurin-jammy
        EXPOSE 8080/tcp
        This dockerfile creates a tomcat container and opens the default port used by tomcat, 8080.
        docker build -t test-container .
        docker run test-container
        If you run the docker ps command, you can see 8080/tcp in the PORTS section.
        img.png

        But if you create a dockerfile without EXPOSE
        FROM tomcat:10.1.17-jdk21-temurin-jammy
        and run docker using the -p flag
        docker build -t test-container .
        docker run -p 8888:8080/tcp test-container
        the docker ps command returns 0.0.0.0:8888->8080/tcp. img.png

        What does this mean?

        When a port only needs to be reached by other docker containers, EXPOSE is sufficient.
        EXPOSE documents the opened port but does not map it to a host port.
        Use it if the application only needs to be accessed by other containers within the same network.

        On the other hand, docker run -p <hostPort>:<containerPort>/tcp opens the port and maps it so the host can reach it.
        Therefore docker ps shows 0.0.0.0:8888 (the host's port)->8080/tcp (mapped to the container's port).

        Use whichever fits your application's use case; the sketch below shows the container-only case.
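        A sketch using a user-defined network (the names test-net and web are made up; any HTTP response, even a 404, proves connectivity):

        docker network create test-net
        docker run -d --name web --network test-net test-container
        docker run --rm --network test-net alpine wget -qO- http://web:8080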
  • docker-compose

    • What is docker-compose

      • img4.png
        Let's say you want to create a service.
        You would want a database, a backend server, and a frontend server.
        Well, docker compose has you covered: you can deploy all of these servers with a single command, docker compose up.

        It is just like a conductor in an orchestra, handling multiple containers in one go.
    • docker-compose.yaml

      • Basics

        version: "3"
        services:
          frontend:
            build: frontend_file/.
            ports:
            - "8081:8080"
          backend:
            build: backend_file/.
            ports:
            - "8082:3000"
          database:
            image: "mysql:8.2.0"
        This docker-compose.yaml means:
        1. Create 3 services named frontend, backend, database.
        2. The Dockerfile for each service is located in frontend_file/. and backend_file/. respectively.
        3. For the database, use the image mysql:8.2.0 from docker hub.
        4. Connect the frontend container's port 8080 to port 8081 of the host.
        5. Connect the backend container's port 3000 to port 8082 of the host.
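        With that file saved as docker-compose.yaml, the usual lifecycle looks like this:

        docker compose up -d             # build and start every service in the background
        docker compose ps                # list the running services
        docker compose logs -f backend   # follow the logs of one service
        docker compose down              # stop and remove the containers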
      • Volumes

        In docker, you can run docker run -v <host-path>:<container-path> <image-name> to mount a path of the host as a volume in your container.

        This is handy when you need to create multiple containers and share volumes between services.
        version: "3"
        services:
          database:
          image: mysql:8.2.0
          volumes:
           - ./host-file-path:/var/lib/mysql
        volumes is an array of <host_path>:<container_path> entries that you want to mount.
        After you mount a volume, changes on either the host or the container affect each other, as it is a "mounted" volume.

        Try the docker exec -it <mycontainer> bash command to double-check.
        Some IDEs like jetbrains also let you connect through the UI.
        img5.png
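        Besides host-path bind mounts, compose also supports named volumes that docker itself manages. A minimal sketch (db-data is an arbitrary name):

        version: "3"
        services:
          database:
            image: mysql:8.2.0
            volumes:
              - db-data:/var/lib/mysql
        volumes:
          db-data: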
      • Environment variables

        img6.png
        If you ran the image: mysql:8.2.0 image above, you will see this error. This is because the environment variables for the mysql connection are not set.
        version: "3"
        services:
          my-service-name:
            image: my-image-name
            environment:
              - MYSQL_ROOT_PASSWORD=value1
              - MYSQL_ALLOW_EMPTY_PASSWORD=value2
              - MYSQL_RANDOM_ROOT_PASSWORD=value3
        You can also set environment variables in docker-compose.
        We saw earlier that ENV in a Dockerfile sets environment variables too.
        Where to put them is a matter of taste, so pick one place and do not mix the two.
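        To keep passwords out of the yaml file, compose can also substitute values from an .env file placed next to docker-compose.yaml. A sketch, reusing the MYSQL_ROOT_PASSWORD variable from above:

        # .env
        MYSQL_ROOT_PASSWORD=my-secret

        # docker-compose.yaml
        services:
          database:
            image: mysql:8.2.0
            environment:
              - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}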
      • depends_on

        This keyword makes a service start only after the services it depends on.
        For example, the frontend should never start before the backend.
        version: '3'
        
        services:
          backend:
            image: backend-image-location
        
          frontend:
            image: frontend-image-location
            depends_on:
            - backend
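        Be aware that plain depends_on only waits for the container to start, not for the application inside to be ready. Recent versions of docker compose can combine it with a healthcheck; a sketch assuming a mysql database:

        services:
          database:
            image: mysql:8.2.0
            healthcheck:
              test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
              interval: 5s
              retries: 10
          backend:
            image: backend-image-location
            depends_on:
              database:
                condition: service_healthy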
      • Replicas

        Just note that this feature exists, but do not use it unless you are using docker swarm. Most companies choose k8s over swarm.
        version: "3"
        services:
          database:
            image: "mysql:8.2.0"
            ports:
              - "8080:3000"
            deploy:
              replicas: 3
        Also, you cannot bind the ports of 3 replicas to the single host port 8080.
        So you should either create 3 separate services or remove the host port.

        If you are thinking of creating a load balancer within docker-compose and redistributing traffic, I strongly advise you to just use kubernetes Services and Deployments.

        It is possible with nginx, but not recommended. kubernetes should handle replicas and traffic, not docker.

Kubernetes

  • Why do we need microservices? Mastering chaos - Netflix

    img7.png From the year 2000, Netflix had a monolith architecture that scaled horizontally.
    They used two databases, "STORE" and "BILLING", which had no replicas whatsoever.
    As time went on, their main server became a massive chunk of code that was over their heads, died constantly, and took all day to boot and debug.

    • Massive chunk of code

    • Massive dependencies

    • Massive cost on boot and debugging

    • High usage of threads and processes

    • New features can kill the whole service

    • A bad query can kill the whole service

    • Massive cost updating databases and columns

    img8.png That is where the microservice architecture comes in.
    In this architecture, a service is divided into multiple parts.

    Netflix

    • user-api
    • product-api
    • platform-api
    • Persistence (Databases)
    • ...

    This did solve the main problems of a monolith architecture, but it has its own downsides.

    • Cascading failures

      When a feature accesses multiple APIs, a single failing point causes the next point to fail, and the failure cascades until it kills the whole system.
      img9.gif
      At Netflix, they solved this by returning a static response in case of a failure.
      This blocks and isolates the failed endpoint, and continues the service with the failure in mind.
    • Database Persistence

      If a service wants to write data to databases in 3 different regions but cannot reach some of them, should the process fail? Should it write to one and apply the changes later?

      Netflix used a NoSQL database called "Cassandra", which manages multiple database nodes and keeps them consistent with one another.

      This might be overkill for most companies, but it is good to know a service that handles databases the "microservice" way. img10.jpeg
    • Stateful Service

      If a node keeps a "state", it stores data within that node that is relevant to the service. The "user" must keep contacting that "service node" because it keeps its "state" within.
      If a service is "stateful", the loss of a node is a cost that results in a failure of the service.
      If a service is "stateless", the loss of a node has no impact on the service, as the user can simply contact another node.
  • Kubernetes components

    Kubernetes is a microservice management tool constructed with multiple components.

    • Master nodes
      • API Server
        • Kubernetes dashboard
        • API
        • Kubectl
      • Control Manager
      • Scheduler
      • etcd
      • Deployment
        • Service
    • Worker nodes
      • Pods
        • Containers

    I will be giving examples on how to set up a kubernetes server without a cloud provider or tools.
    If you want to configure kubernetes manually, you can use kubeadm, kOps, or kubespray.
    If you want to configure kubernetes on AWS, you can use EKS or eksctl.
    In this guide, I will use kubeadm, following the official documentation.

    • Installation

      • Open ports

        In order for the Master node and worker nodes to function, you need these specific network rules. img12.png

        Now if you are using AWS, you can configure security groups for inbound and outbound rules. img13.png

        You can use netcat (nc) in order to check if the ports are open and connectable between nodes.
        For example, if you want to check if the port 10250 is open in your worker node,
        run nc -l 10250 in your worker,
        run nc -zv <worker-node-ec2-ip> 10250 in your master.

        Your master's netcat should return something like this.
        img14.png

      • Container runtime

        The kubernetes documentation notes that you can use one of multiple container runtimes, such as containerd, Docker Engine, or CRI-O. I will just install docker with the simple sudo apt install method.
        Kubernetes did drop dockershim support from its project and now talks to containerd directly, but docker uses containerd under the hood, so we shouldn't be worried about it.
        Install docker on ubuntu

        # Add Docker's official GPG key:
        sudo apt-get update
        sudo apt-get install ca-certificates curl gnupg
        sudo install -m 0755 -d /etc/apt/keyrings
        curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
        sudo chmod a+r /etc/apt/keyrings/docker.gpg
        
        # Add the repository to Apt sources:
        echo \
        "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
        $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
        sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
        sudo apt-get update
        sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
      • Installing kubeadm, kubelet and kubectl

        This installation uses the apt package method, so if you are using redhat or any other distribution, check the matching instructions.

        sudo apt-get update
        # apt-transport-https may be a dummy package; if so, you can skip that package
        sudo apt-get install -y apt-transport-https ca-certificates curl gpg
        curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
        # This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
        echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
        sudo apt-get update
        sudo apt-get install -y kubelet kubeadm kubectl
        sudo apt-mark hold kubelet kubeadm kubectl
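        You can verify the installation before continuing:

        kubeadm version
        kubectl version --client
        # the kubelet now restarts in a crash loop every few seconds,
        # waiting for kubeadm init/join to tell it what to do; this is expected
        systemctl status kubelet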
      • Installing master node

        sudo kubeadm init --pod-network-cidr={your cidr ip}/{your cidr mask}

        --pod-network-cidr=172.31.0.0/16 configures the cidr block used for pod IPs.
        --control-plane-endpoint specifies the endpoint (load balancer or IP) for the control plane. This is used in HA setups where the control plane is distributed across multiple nodes.
        --upload-certs is used to upload certificates to the Kubernetes configuration directory. This is essential for joining additional control plane nodes to the cluster.

        If you have successfully installed the master node, this should pop up.

        img15.png You must configure the $HOME/.kube/config file afterward.

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/kubelet.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config
        # restart kubelet in order to let the configuration apply
        sudo systemctl restart kubelet.service
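        The init output also asks you to deploy a pod network add-on (CNI); nodes stay NotReady and CoreDNS stays pending until one is installed. Flannel is a common choice, and the manifest URL below is the one its repository publishes at the time of writing, so double-check it:

        kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

        Note that flannel defaults to the pod CIDR 10.244.0.0/16, so pass a matching --pod-network-cidr to kubeadm init or adjust the flannel config.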
      • Installing worker node

        If you have successfully installed the master node, you will get a response like this.
        kubeadm join 172.31.14.79:6443 --token 8xwmlv.gs5787m8y16ty0zx \
          --discovery-token-ca-cert-hash sha256:0ddf616b72ad2065e6312b17f03c0a77e090547a5e8190296141a5833a5d6a6c

        Run the command on the worker node, and you should get a response like this.

        img16.png

        If you return to your master node and check kubectl get nodes, you should see the worker listed. img17.png
        Congrats!
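        The join token expires after 24 hours by default. If you lost the command or the token expired, you can print a fresh join command from the master node:

        kubeadm token create --print-join-command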

      • errors

      • unknown service runtime.v1.RuntimeService

        If you have any errors, run kubeadm reset to undo the changes made by kubeadm.
        Errors I have encountered
        kubernetes-sigs/cri-tools#1089

        validate service connection: validate CRI v1 runtime API for endpoint "unix:///run/containerd/containerd.sock": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService
        

        This error happens because kubeadm cannot communicate with containerd through its socket using crictl.

        sudo crictl -r unix:///run/containerd/containerd.sock ps
        # expected result
        CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID              POD
        

        The commands below replace /etc/containerd/config.toml with containerd's default configuration.
        Make sure disabled_plugins = ["cri"] is changed into disabled_plugins = [], then restart containerd.

        sudo su
        containerd config default | tee /etc/containerd/config.toml
        sudo vim /etc/containerd/config.toml
        systemctl restart containerd
        
      • Get "https://{server_ip_address}:6443/api?timeout=32s": dial tcp {server_ip_address}:6443: connect: connection refused

        If the {server_ip_address} is localhost, check kubectl config view.
        It will show that ~/.kube/config is not set.
        https://stackoverflow.com/questions/76841889/kubectl-error-memcache-go265-couldn-t-get-current-server-api-group-list-get

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/kubelet.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

        If your configuration is set, it means that you did not restart the kubelet to apply the changed configuration.

        # restart kubelet in order to let the configuration apply
        sudo systemctl restart kubelet.service
      • Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

        This error happens because the configuration set in $HOME/.kube/config has certificate problems.

        user:
           client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
           client-key: /var/lib/kubelet/pki/kubelet-client-current.pem

        The .pem files might not exist. If this happens on the control plane, you can copy admin.conf as the configuration instead.

        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    • Service

      A Service is the network configuration in front of a set of pods.
      There are various types you can assign to your service.

      • ClusterIP
      • NodePort
      • LoadBalancer
      • ExternalName


      This is a basic configuration of a service.yaml.
      Let's go through the basics and see what each component does.

      apiVersion: v1
      kind: Service
      metadata:
        name: my-service
      spec:
        type: NodePort
        selector:
          app: my-app
        ports:
          - protocol: TCP
            port: 80
            targetPort: 8080

      spec.type

      • Note that the kind must be Service, and you specify which kind of service it is with spec.type.
        If you do not specify it, it defaults to ClusterIP.

      spec.selector

      • The spec.selector you see above is app: my-app.
        This is how the service knows which pods are assigned to it.
        You can assign these labels to specific pods through deployment.yaml.
      • apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: my-deployment
        spec:
          replicas: 3
          selector:
            matchLabels:
              app: my-app
          template:
            metadata:
              labels:
                app: my-app
            spec:
              containers:
                - name: my-container
                  image: my-image
        This is the deployment example.
        You can see two instances of app: my-app.

        spec.selector.matchLabels = app: my-app
        This means that the Deployment identifies its pods by the app: my-app label.

        spec.template.metadata.labels = app: my-app
        This means that each pod created from the template will carry the label app: my-app.

        So you need 3 instances of app: my-app in total:
        two in deployment.yaml and one in service.yaml.
        Note that all labels need to match if multiple are given, as the query below shows.
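        You can check which pods carry the label, and therefore which pods the service will pick up, with a label selector query:

        kubectl get pods -l app=my-app -o wide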

      ports.port && ports.targetPort

      • ports.port is the port that the service itself opens for access.
      • ports.targetPort is the port on the pods that the service forwards traffic to.
      • img18.png
      • You can get the internal (cluster-ip) and external IP of the service and its open ports with the command
        kubectl get service <my-service-name>
        img19.png
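        Putting it together, a NodePort service is reachable from outside the cluster through any node's IP. A quick check (the node IP and the auto-assigned nodePort are placeholders):

        kubectl apply -f deployment.yaml -f service.yaml
        kubectl get service my-service          # note the node port, e.g. 80:3xxxx/TCP
        curl http://<node-ip>:<nodePort>        # forwarded to port 8080 on one of the pods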

Local Servers

  • Why would you want to escape AWS?

    The tech industry is currently so tangled with cloud providers
    that we can hardly come up with an idea that does not use or rely on such services.
    But in some cases, it is better to go local and manage your own servers in order to cut costs. $400,000,000 Saved - NO MORE AWS
    I am not in the position, nor do I have the experience, to make such claims myself, but I do understand the points made and can give some cases.
    • Using extreme computing hardware. (ex: AI).

    • If your service relies on computing power itself as a source of income.

    • If you are a student or a hobbyist sick of paying per month to AWS. (Me)

  • Why did I choose to migrate some of my projects out of AWS?

    I have created a k8s cluster on aws trying to learn and deploy a microservice on my own. img20.png
    But k8s currently requires a machine with 2GB of memory and 2 vCPUs.
    And k8s requires at least 1 control plane and 1 worker in order to get the full ordeal.
    This would require me to get at least 2x t2.medium. img21.png
    The monthly cost ramps up to $170 per month once you consider all the additional costs for networking and a relational database.
    All of which, in fact, could run on a computer that you own in your house.

    These are my bills, by the way. XD
    img22.png
  • The cost-return factor of custom-bought computers.

    Now, if you do plan to go local, you should be quite careful to choose the right hardware.
    It is up to you, because there are so many vendors and so many choices to be made.

    But to keep it simple, I will compare the t2.medium with the Raspberry Pi 5.

    A t2.medium has a max of 3.3 GHz and 4GB of memory.
    A Raspberry Pi 5 has 2.4 GHz by default and 4GB of memory.

    (at the time of this writing)
    A t2.medium would cost you $42.50 per month.
    A Raspberry Pi would cost you $73 to purchase.

    If you run your local server for more than 2 months, the savings cover the initial cost.
    The same can be said for GPU-intensive servers and NAS servers.
    In most cases it takes 2 months to 1 year to earn back your initial investment.
  • It is not all roses and flowers

    The sole reason AWS became dominant in the first place is that managing a local server used to be such a hassle.
    The ISP that you have may not be able to handle the kind of traffic AWS can.
    Servers managed by you or your company can also carry a huge risk of downtime,
    like the example of the Kakao datacenter fire.
    Business-critical information such as money transfers can be lost in such occasions.
    You should think of local servers as a choice that can be made, not as a belief.
  • Installing k8s in local servers
