Minik8s

Minik8s is a mini container orchestration tool similar to Kubernetes, which can manage containers that meet the CRI interface on multiple machines. It supports basic functions such as container lifecycle management, dynamic scaling, automatic scaling, and provides serverless platform integration based.

The specific requirements can be referred to the following requirement document.

All video demonstrations mentioned in this project are available for viewing and download on Google Drive.

Team Leader: Chen Bao @ Kami-code

Team Member: Huidong Xu @ WilliamX1

Team Member: Yixiang Liu @ liuyixiang42

Architecture

Overall Architecture

The architecture view of Minik8s is shown below, which mainly refers to the Minik8s best practice architecture provided in class. It can be divided into three parts: user space, control plane, and worker nodes.

In the user space, we provide three executable scripts for users to meet different needs. These three scripts will eventually make requests to the API server, which schedules and forwards them according to corresponding logic. We have provided numerous controllers to monitor the corresponding states. For kubelet's requests, the API server will forward them to the corresponding worker.

In terms of multi-node deployment, each worker is configured with Flannel, which connects the IP allocation and forwarding across multiple nodes.

Design Principle

We deploy in a master-worker node pattern which is not limited to virtual machines, but rather a logical concept where a single host can act as both master and worker nodes.

On the master node, we need to run /master/api_server.py for interacting with various components, /master/replica_set_controller.py for handling requests pertaining to ReplicaSet, /master/service_controller.py for handling requests pertaining to Service, /master/dns_controller.py for handling requests related to Dns, and /master/garbage_collector.py for recovering failed request garbage information.

On the worker node, we only need to run /worker/kubelet_flask.py to handle requests sent by the master node. We can also interact with the cluster through the command line by running /userland/kubectl.py.

This master-worker node setting makes it very easy for multiple machines (not limited to 2) to collectively form a cluster.

Directory Structure

/.workflow: CI/CD config file
/doc: project documents
/helper: global parameters and functions
- const.py: global parameters, including some directory paths
- utils.py: global functions, mainly including the function to change the item of iptables
- yaml_loader.py: functions to load yaml file
/master: directory for master node
- /dns: DNS forwarding using Nginx Service
  - /nginx
    - /conf: map to /etc/nginx/conf.d/ of the Nginx container, producing *.conf when DNS is created.
    - /html: map to /usr/share/nginx/html/of the Nginx container.
    - /log: map to /var/log/nginx of the Nginx container, producing error log for debug.
  - dns-nginx-server-replica-set.yaml: The basic information YAML file for Nginx, such as the number of Pods.
  - dns-nginx-server-service.yaml: The basic information YAML file for the Nginx service, such as the virtual IP address and exposed port of the service.
- api_server.py: The api server of minik8s uses Flask to implement inter-network communication and uses Etcd for persistent storage.
- dns_controller.py: the Dns processor of minik8s, responsible for creating, updating, and deleting user-uploaded Dns records.
- etcd_controller.py: the etcd daemon used by minik8s for fault tolerance.
- garbage_collector.py: the garbage collector of minik8s, responsible for recovering failed Pod scheduling.
- node_controller.py: the Node processor of minik8s, responsible for adding and deleting nodes.
- replica_set_controller.py: the Replicaset processor of minik8s, responsible for creating, scheduling, and deleting user-uploaded ReplicaSets.
- serverless.py: the Serverless implementation logic of minik8s.
- service_controller.py: the Service processor of minik8s, responsible for creating, updating, and deleting user-uploaded Services.
/useland: directory for the user
- /final_check: directory for final release, including some yaml files for demonstrations
- /frontend: the front-end used to demonstrate Serverless.
- /gpu: the GPU support part of minik8s.
- /parameters: the yaml file parameters for GPU demonstration at final defense.
- /user_serverless_scripts: the add and multiply scripts for GPU.
- /yaml_default: the user-defined yaml files.
- kubectl.py: the user interactive command line of minik8s.
- kubectl_gui.py: the user graphical interactive interface of minik8s, mainly used for selecting and uploading yaml files.
/worker: directory for worker node
- /HPA_test_docker
- /gpu: code supporting GPU applications
- /multi_machine
  - /configs: the config of Flannel, which can be used as the subnet for allocating global IP addresses, and the default is 20.20.0.0/16.
  - /etcd: etcd for persistent storage
  - /scripts: some shell scripts, mainly used to config Flannel and Docker when starting kubelet_flask
- /nodes_yaml: the yaml file for user to define node
- /sources: some resources
- entities.py: some abstraction class
- kubedns.py: functions for DNS and forwarding
- kubelet_flask.py: functions to run the worker like k8s
- kubeproxy.py: utility functions for Service
requirements.txt: project package requirements
run.sh: CI/CD running scripts

Installation

Make sure your node is connected to the Internet and can be accessed via the intranet, and install Docker and Anaconda.
Configuring Security Group. Add inbound and outbound rules for IP segments to prevent network requests from being blocked. For example, add inbound and outbound rules for 192.168.0.0/16, 172.17.0.0/16, and 20.0.0.0/8.
Configure the conda environment and activate it.

$ conda create -n minik8s python=3.8
$ conda activate minik8s
$ vim ~/.bashrc
$ source ~/.bashrc

Clone this repository and install the required packages.

$ git clone https://gitee.com/Leimak/minik8s
$ cd minik8s
$ git checkout -b master origin/master
$ pip uninstall protobuf
$ pip install -r requirements.txt

Usage

Deploying Minik8s on Multiple Hosts

[Video Demonstration]

We use Flask for network communication to implement message passing between worker nodes and master nodes, and use Flannel + etcd to ensure that the IP addresses assigned to Pods on multiple hosts are globally unique.

We deploy two virtual machines, where "minik8s-1" serves as both the master and worker node, while "minik8s-2" only serves as a worker node.

Node Type	Hostname	Private IP Address	Floating IP Address
master + worker	minik8s-1	192.168.1.12	11.119.11.120
master	minik8s-2	192.168.1.5	11.119.10.16

The configuration file for the master node, /worker/multi_machine/nodes_yaml/master.yaml , is shown below with explanations in comments.

ETCD_NAME: etcd  # the name of etcd, --name etcd
IP_ADDRESS: http://10.119.11.120  # the floating IP address of node running etcd
ETCD_INITIAL_CLUSTER: etcd=http://10.119.11.120:2380  # etcd initial cluster
ETCD_INITIAL_CLUSTER_STATE: new  # the state of etcd
API_SERVER_URL: http://192.168.1.12:5050  # the IP address of api server

The configuration file for the worker node, /worker/multi_machine/nodes_yam/worker1.yaml , is shown below with explanations in comments.

MASTER_ETCD_CLIENT_URL: http://10.119.11.120:2379  # the ip of master
IP_ADDRESS: http://10.119.11.120  # the floating ip of each worker
API_SERVER_URL: http://192.168.1.12:5050  # master api server 的 ip
WORKER_PORT: 5051  # the port of worker

Before running, we can execute /worker/multi_machine/scripts/clean-shell.sh to clean up any residual etcd, flannel, and docker processes that may be present in the system.

After ensuring that there are no related processes, we first run the API server on minik8s-1. The screenshot below shows that the master is running at 127.0.0.1:5050 on this machine and at 192.168.1.12:5050 on other machines, indicating that the API server has been successfully started.

$ python3 ./master/api_server.py

Next, we will run kubelet_flask.py on two virtual machines separately and pass the configuration file as a parameter. It can be seen that the two workers are running on port 5051 on their respective machines, which corresponds with the WORKER_PORT in the Node configuration file, indicating that joining the cluster was successful.

$ python3 ./worker/kubelet_flask.py worker1 # run on minik8s-1
$ python3 ./worker/kubelet_flask.py worker2 # run on minik8s-2

We run /userland/kubectl.py to interact with the cluster via the command line, and use the show nodes command to obtain the status of the Node, including the unique identifier name, current status Running, current IP working_url, current virtual memory total_memory , current memory usage rate memory_use_percent and CPU usage rate cpu_use_percent.

$ python3 ./userland/kubectl.py
>>> show nodes

Since now, the multi-node deployment has been completed.

Running Pods and Monitoring the Lifecycle of Containers

[Video Demonstration1], [Video Demonstration2], [Video Demonstration3], [Video Demonstration4]

As the smallest scheduling unit in minik8s, Pod is a collection of multiple Docker containers. Therefore, we designed a Pod class to more clearly describe and store Pod information and use Python Docker SDK to start specific containers.

In terms of networking, we additionally start a pause container for each Pod. All user-defined containers share the network stack with this pause container which enables network communication between different containers.

We can use kubectl to upload and create Pods through commands, or we can use kubectl_gui to upload and create Pods with a graphical interface. Both methods will verify whether the file path exists and whether the contents in the YAML file are valid. Note that we use the $ symbol to represent the root directory of the project, which in this case is /home/xhd/Desktop/minik8s/.

>>> start -f $/userland/final_check/pod-1.yaml

We can use kubelet to run the show pods command to view the abstract configuration of Pods, which includes the Pod's name (name), globally unique instance_name, running status (status), creation time (created time), globally unique IP address, mounted data volume (volume), exposed ports (ports), total CPU limit (cpu), total memory limit (mem), and the node to which it is assigned (Node).

>>> show pods

When checking the actual running status of the containers, we found that the user-defined nginx, jetty and busybox containers have all started successfully. Furthermore, we can see that the busybox image executed the user-defined sh -c 'sleep 360000000' command.

$ docker ps -a

The Nginx container exposes port 80, and the Jetty container exposes port 8080. We can use curl between these two containers to verify communication between multiple containers within the Pod using localhost. Note that Jetty is like Tomcat, used for quickly compiling and deploying WAR files. As we are running an empty Jetty container here, its homepage displays the text "Error 404 - Not Found". This also means that we can access this port through the local network since the corresponding HTML source code would not appear if it was not accessible.

$ docker exec nginx-xxxx curl localhost:80
$ docker exec nginx-xxxx curl localhost:8080
$ docker exec jetty-xxxx curl localhost:80
$ docker exec jetty-xxxx curl localhost:8080

We have implemented two scheduling strategies: random and round-robin. When a node cannot be scheduled, the remaining nodes will either be selected at random or in a round-robin fashion until a suitable node is found (unsuitability may be due to CPU or memory limitations exceeding current node conditions). If no nodes are suitable, the scheduling will fail and the Pod will not be created.

Running and Access Service with Virtual IP Address

[Video Demonstration1], [Video Demonstration2]

Service is a virtual Kubernetes abstraction that sits on top of multiple Pods. The challenge is how to strategically forward user access to custom virtual IPs to the actual Pod IPs. We refer to Kubernetes' modifications to iptables, and have appropriately trimmed some of the filtering rules. We use the method of directly executing shell commands to write the virtual machine's iptables to achieve Service accessibility.

We can create a Service using the kubectl command line or the kubectl_gui graphical interface.

>>> start -f $/userland/final_check/service-1.yaml
$ python3 ./master/service_controller.py

We can demonstrate the multi-machine accessibility of a Service by requesting ClusterIP:Port from different virtual machines, and also from the host machine and container.

For example, we can request 192.168.88.88:88 from both the host machine and the container in minik8s-1.

(minik8s-1) $ curl 192.168.88.88:88
(minik8s-1) $ docker exec nginx-xxx curl 192.168.88.88:88

We can request 192.168.88.88:88 from both the host machine and the container in minik8s-2.

(minik8s-2) $ curl 192.168.88.88:88
(minik8s-2) $ docker exec nginx-xxx curl 192.168.88.88:88

We find that we can access the corresponding Service through the virtual IP and port locally and in the container on both machines.

Running Replica Set on Pods

[Video Demonstration]

The challenge in abstracting a ReplicaSet is how to monitor the state of the Pods. Therefore, the kubelet of each worker node sends periodic heartbeats to the API Server indicating the status of the Pods running on that node. The ReplicaSet Controller then monitors the state of the Pods and ensures that the number of replicas matches the expected value.

We can create a ReplicaSet using the kubectl command line or the kubectl_gui graphical interface. To show the current ReplicaSets and Pods, we can use the following commands:

>>> start -f $/userland/final_check/repliceset-1.yaml
>>> show replicasets
>>> show pods

We can see that we have created three Pods. Then, we can use the docker rm command to remove the container maintained by one of the Pods.

At this point, we may notice that one of the Pods is showing as "Failed", and a new Pod has just been created.

Running Auto Scaling on Replica Set

[Video Demonstration]

HPA is mainly implemented based on ReplicaSet. Before the ReplicaSet Controller executes its logic, the number of replicas for the ReplicaSet corresponding to HPA will be dynamically modified based on the HPA metrics.

We can create a ReplicaSet using the graphical interface. Then, we can use the following commands to show the current HPA and Pods. At this point, four Pods have been created:

>>> start -f $/userland/yaml_default/hpa_test.yaml

After reducing the load, we can see that the number of Pods has decreased to 3.

Forwarding using DNS

[Video Demonstration]

The key to DNS forwarding is to correctly resolve the domain name to the virtual IP of the corresponding Service. To support multiple sub-paths corresponding to different Services, we set up an Nginx service as a reverse proxy.

First, we create two Services as follows:

name	clusterIP	ports	pod
pod-1-2-service	192.168.88.88	88 -> 80/tcp	pod-1, pod-2
pod-3-4-5-service	192.168.99.99	99 -> 80/tcp	pod-3, pod-4, pod-5

We used kubelet to create the DNS setup described above. Since we did not use a plugin for DNS forwarding and instead used an Nginx container as a reverse proxy, we essentially wrapped the Nginx forwarding into a Service and wrote the domain name and Nginx ClusterIP to the /etc/hosts file on the host and container. This allows requests to be routed through the Nginx reverse proxy when the host or container does not recognize the domain name.

>>> start -f $/userland/final_check/dns-1.yaml
$ python3 dns_controller.py

We can see that there are two DNS records, each corresponding to a different path defined in the DNS file.

We can check in /master/dns/nginx/conf/ and find that a minik8s-dns.conf file has been generated with the following contents:

server {
  listen: 80
    server_name minik8s-dns;
    location /pod-1-2-service {
    proxy_pass http://192.168.88.88:88/;
  }
  location /pod-3-4-5-service {
    proxy_pass http://192.168.99.99:99/;
  }
}

Then, we can check the /etc/hosts files on the host and in the container to see that the relevant entries have been added.

Then, we can make requests to the different paths corresponding to this domain name from both the host and container on each of the two virtual machines.

This verifies that we can access the services provided by the Services mapped to the domain name through DNS resolution and forwarding.

Running GPU Applications

[Video Demonstration]

Firstly, we upload the yaml file for the job and the corresponding user CUDA file to the API Server using the following command:

>>> upload job -f ./gpu/gpu.yaml
>>> upload job -f ./gpu/data/add.cu
>>> upload job -f ./gpu/data/add.slurm

Then, we use a command to start and submit the job:

>>> start job add
>>> submit job add

Next, we can use the following command to obtain the execution results:

download job add -f ./data

Run Serverless Cloud Functions

[Video Demonstration]

To achieve function abstraction, we use Python Flask as the backend to receive requests forwarded by kubectl from the API Server. The format of the user-defined function is as follows:

# my_module.py
def my_function(event: dict, context: dict)->dict:
    return {"result": "hello {}{}!".format(event, context)}

The parameters passed in the Python environment include event and context, both of which are dictionary types.

event: Use this parameter to pass the trigger event data.
context: Use this parameter to pass runtime information to your handler.

The return value is also a dictionary type, and the "result" field describes the actual return value.

Next, we will build the Serverless Server. Whenever a request arrives at the API Server, if the API Server finds that there is no running instance of the Serverless Server, it creates a new instance through the dockerfile; if there is a running instance, it forwards the HTTP request.

Next, we need to wrap the Dockerfile according to the "my_module.py" file passed in by the user, use the Dockerfile to build an image.

FROM ubuntu
MAINTAINER xxx <user@example.org>
RUN DEBIAN_FRONTEND=noninteractive apt-get -y update
RUN DEBIAN_FRONTEND=noninteractive apt-get -y install python3 python3-pip
RUN pip install -r requirements.txt
RUN pip install flask
ADD ./serverless_server.py serverless_server.py
ADD ./my_module.py my_module.py
CMD ["python3","/serverless_server.py"]

We use the following command to build and run the image:

docker build -t serverless_test .
docker run -p 5000:5000 serverless_test

The results are as follows:

(base) baochen@baochen-Lenovo-Legion-Y7000-1060:~/Desktop/serverless_test$ docker run -p 5000:5000 serverless_test
 * Serving Flask app 'serverless_server' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on all addresses (0.0.0.0)
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://127.0.0.1:5000
 * Running on http://172.17.0.2:5000 (Press CTRL+C to quit)

We access the URL "http://127.0.0.1:5000/function/my_module/my_function" on the host machine, and we can see the following result:

Using Serverless DAG to Specify Transition Condition and Path

We designed a frontend for a directed acyclic graph (DAG) that supports custom serverless functions using the React framework. Users can set up functions by adding or deleting nodes and can set transition conditions along the path.

We use the following command:

>>> upload function -f ./user_serverless_scripts/add.py
>>> upload function -f ./user_serverless_scripts/multiply.py
>>> show functions
>>> trigger function serverless-add -p ./parameters/add_param.yaml

The parameters we passed in are a=5, b=3, and the add function will automatically add the two values together and return the result as result. When we make multiple requests, we find that the Serverless Server is automatically scaled out and load balanced.

Next, we construct a Workflow in DAG_uploader as follows:

The condition for the add.add node to jump to the output node is result < 11, while the condition for jumping to the multiply.multiply node is result >= 11. We first execute the Workflow with the parameters a=5, b=3:

As we can see, the add.add node took the path where result < 11 and output directly.

Next, we change the initial parameters and execute the Workflow with a=5, b=7:

We can see that the add.add node, due to result >= 11, continued to enter the multiply.multiply node, and the final output is the product of the two values.

References

Python Docker SDK

Kubernetes 源码分析

Kubernetes Services Networking

Kubernetes Service 类型介绍

https://level-up.one/category/kubernetes/

https://matthewpalmer.net/kubernetes-app-developer/articles/kubernetes-networking-guide-beginners.html

https://zhuanlan.zhihu.com/p/90992878

https://blog.csdn.net/qq_41861526/article/details/97621144

https://www.velotio.com/engineering-blog/flannel-a-network-fabric-for-containers

https://cloud.tencent.com/developer/article/1603511

https://dustinspecker.com/posts/iptables-how-kubernetes-services-direct-traffic-to-pods/

https://mvallim.github.io/kubernetes-under-the-hood/documentation/kube-flannel.html

https://docker-k8s-lab.readthedocs.io/en/latest/docker/docker-flannel.html

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Minik8s

Table of Contents

Architecture

Overall Architecture

Design Principle

Directory Structure

Installation

Usage

Deploying Minik8s on Multiple Hosts

Running Pods and Monitoring the Lifecycle of Containers

Running and Access Service with Virtual IP Address

Running Replica Set on Pods

Running Auto Scaling on Replica Set

Forwarding using DNS

Running GPU Applications

Run Serverless Cloud Functions

Using Serverless DAG to Specify Transition Condition and Path

References

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.workflow		.workflow
README		README
doc		doc
helper		helper
master		master
userland		userland
worker		worker
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
run.sh		run.sh

Kami-code/minik8s

Folders and files

Latest commit

History

Repository files navigation

Minik8s

Table of Contents

Architecture

Overall Architecture

Design Principle

Directory Structure

Installation

Usage

Deploying Minik8s on Multiple Hosts

Running Pods and Monitoring the Lifecycle of Containers

Running and Access Service with Virtual IP Address

Running Replica Set on Pods

Running Auto Scaling on Replica Set

Forwarding using DNS

Running GPU Applications

Run Serverless Cloud Functions

Using Serverless DAG to Specify Transition Condition and Path

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages