
Have you ever thought about building your own cloud? I bet you have. But is it possible to do this using only modern technologies and approaches, without leaving the cozy Kubernetes ecosystem? Our experience in developing Cozystack required us to delve deeply into it.

You might argue that Kubernetes is not intended for this purpose: why not simply use OpenStack for bare metal servers and run Kubernetes inside it, as intended? But by doing so, you would simply shift the responsibility from your hands to the hands of OpenStack administrators. This would add at least one more huge and complex system to your ecosystem.

Why complicate things? After all, at this point Kubernetes already has everything needed to run tenant Kubernetes clusters.

I want to share our experience in developing a cloud platform based on Kubernetes, and to shed light on the open-source projects that we use ourselves and believe deserve your attention.

In this series of articles, I will tell you the story of how we build managed Kubernetes from bare metal using only open-source technologies: starting from basic data center preparation, running virtual machines, isolating networks, and setting up fault-tolerant storage, all the way to provisioning full-featured Kubernetes clusters with dynamic volume provisioning, load balancers, and autoscaling.

With this article, I start a series consisting of several parts:

- **Part 1**: Preparing the groundwork for your cloud. Challenges faced during the preparation and operation of Kubernetes on bare metal and a ready-made recipe for provisioning infrastructure.
- **Part 2**: Networking, storage, and virtualization. How to turn Kubernetes into a tool for launching virtual machines and what is needed for this.
It is important to understand that the use of Kubernetes in the cloud and on bare metal differs.

### Kubernetes in the Cloud

When you operate Kubernetes in the cloud, you don't worry about persistent volumes, cloud load balancers, or the process of provisioning nodes. All of this is handled by your cloud provider, who accepts your requests in the form of Kubernetes objects. In other words, the server side is completely hidden from you, and you don't really need to know exactly how the cloud provider implements it, as it's not in your area of responsibility.

![kubernetes in the cloud illustration](cloud.svg)

In the cloud, you always have several separate entities: the Kubernetes control plane, virtual machines, persistent volumes, and load balancers.

Thanks to Kubernetes, virtual machines are now seen only as a utility for consuming cloud resources. You no longer store data inside virtual machines. You can delete all your virtual machines at any moment and recreate them without breaking your application. The Kubernetes control plane will continue to hold information about what should run in your cluster. The load balancer will keep sending traffic to your workload, simply switching the endpoint to a new node. And your data will be safely stored in external persistent volumes provided by the cloud.
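To make this concrete, here is a minimal sketch of the kind of objects you hand to the provider; the names and sizes are illustrative, not from any real setup:

```
# A persistent volume and a cloud load balancer, requested as plain Kubernetes objects
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer   # the cloud provider provisions the external balancer for you
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```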

This approach is fundamental when using Kubernetes in the cloud. The reason for it is quite obvious: the simpler the system, the more stable it is, and it is for this simplicity that you buy Kubernetes in the cloud.

### Kubernetes on bare metal

Using Kubernetes in the cloud is really simple and convenient, which cannot be said about bare metal installations. In the bare metal world, Kubernetes, on the contrary, becomes unbearably complex. Firstly, because the entire network, backend storage, cloud balancers, and so on are usually run not outside, but inside your cluster. As a result, such a system is much more difficult to update and maintain.

![kubernetes on bare metal illustration](baremetal.svg)

Judge for yourself: in the cloud, to update a node, you simply delete the virtual machine and create a new one from a fresh image. It joins the cluster and just works as a new node; this is a very simple and commonly used pattern in the Kubernetes world. Many teams order new virtual machines every few minutes, simply because they can use cheaper spot instances. However, when you have a physical server, you can't just delete and recreate it: it often runs cluster services, stores data, and its update process is significantly more complicated.

There are different approaches to solving this problem, ranging from in-place updates, as done by kubeadm, kubespray, and k3s, to full automation of provisioning physical nodes through Cluster API and Metal3.

I like the hybrid approach offered by Talos Linux, where your entire system is described in a single configuration file. Most parameters of this file can be applied without rebooting or recreating the node, including the versions of the Kubernetes control-plane components. At the same time, it preserves the declarative nature of Kubernetes as much as possible.
This approach minimizes unnecessary impact on cluster services when updating bare metal nodes. In most cases, you won't need to migrate your virtual machines or rebuild the cluster filesystem on minor updates.
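For illustration, here is a rough sketch of what this workflow looks like with talosctl; the node address, file name, and version below are placeholders, not taken from a real setup:

```
# Apply a changed machine configuration; Talos decides whether a reboot is needed
talosctl apply-config --nodes 10.0.0.2 --file controlplane.yaml

# Declaratively upgrade the Kubernetes control-plane components
talosctl --nodes 10.0.0.2 upgrade-k8s --to 1.29.1
```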

### Preparing a base for your future cloud

```
output:
  outFormat: raw
```

Then you run a simple docker command to build it into a ready-made image:

```
cat config.yaml | docker run --rm -i -v /dev:/dev --privileged "ghcr.io/siderolabs/imager:v1.6.4" -
```

And as a result, you will get an image with everything you need, which you can use to install Talos Linux on your servers. This image will contain all the necessary firmware and kernel modules.
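Assuming the build produced a raw disk image (the file and device names here are placeholders), flashing it to a server disk can look like this:

```
# Decompress the image and write it to the target disk (double-check the device name!)
xz -dc metal-amd64.raw.xz | dd of=/dev/sdX bs=4M status=progress
```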

But the question arises: how do you deliver the freshly formed image to your nodes?

I have been contemplating the idea of PXE booting for quite some time. For example, the Kubefarm project that I [wrote an article about](https://kubernetes.io/blog/2021/12/22/kubernetes-in-kubernetes-and-pxe-bootable-server-farm/) two years ago was entirely built on this approach. But unfortunately, it does not help you deploy the very first parent cluster that will hold the others. So now we have prepared a simple solution that will help you do this using the same PXE approach.

Essentially, all you need to do is run temporary DHCP and PXE servers in Docker containers. Then your nodes will boot from your image, and you can use a simple Debian-like script to help you bootstrap your nodes.
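As a rough sketch of the idea (talos-bootstrap automates this for you; the container image, interface name, and address range below are assumptions), a throwaway dnsmasq container can serve both DHCP and TFTP:

```
# Temporary DHCP + TFTP (PXE) server; requires host networking and NET_ADMIN
docker run --rm -it --network host --cap-add NET_ADMIN \
  -v "$PWD/tftp:/var/lib/tftpboot" \
  alpine sh -c 'apk add dnsmasq && dnsmasq --no-daemon \
    --interface=eth0 \
    --dhcp-range=192.168.1.100,192.168.1.200,1h \
    --enable-tftp --tftp-root=/var/lib/tftpboot \
    --dhcp-boot=ipxe.efi'
```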

[![asciicast](https://asciinema.org/a/627123.svg)](https://asciinema.org/a/627123)

The source of this [talos-bootstrap](https://github.com/aenix-io/talos-bootstrap/) script is available on GitHub.

This script allows you to deploy Kubernetes on bare metal in five minutes and obtain a kubeconfig for accessing it. However, many unresolved issues still lie ahead.
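Once the cluster is up, fetching the kubeconfig is a one-liner (the node address is a placeholder; talos-bootstrap performs the equivalent step for you):

```
# Retrieve the kubeconfig from a control-plane node and check that it works
talosctl --nodes 192.168.1.10 --endpoints 192.168.1.10 kubeconfig ./kubeconfig
kubectl --kubeconfig ./kubeconfig get nodes
```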

### Delivering System Components

At this stage, you already have a Kubernetes cluster capable of running various workloads. However, it is not fully functional yet. In other words, you need to set up networking and storage, as well as install necessary cluster extensions, like KubeVirt to run virtual machines, as well as the monitoring stack and other system-wide components.

Traditionally, this is solved by installing Helm charts into your cluster. You can do this by running `helm install` commands locally, but this approach becomes inconvenient when you want to track updates, and when you have multiple clusters that you want to keep uniform. In fact, there are plenty of ways to do this declaratively. To solve this, I recommend using best GitOps practices, and by that I mean tools like ArgoCD and FluxCD.

While ArgoCD is more convenient for dev purposes with its graphical interface and a central control plane, FluxCD is better suited for creating Kubernetes distributions. With FluxCD, you can specify which charts should be launched with which parameters, and describe dependencies between them. Then, FluxCD will take care of everything for you.
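As a sketch of what this looks like (the chart, namespaces, and repository names below are illustrative, not the actual Cozystack manifests), a FluxCD HelmRelease declares a chart together with its dependencies:

```
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: kubevirt
  namespace: cozy-kubevirt
spec:
  interval: 5m
  # FluxCD will not reconcile this release until its dependencies are ready
  dependsOn:
    - name: cilium
      namespace: cozy-cilium
  chart:
    spec:
      chart: kubevirt
      sourceRef:
        kind: HelmRepository
        name: cozystack-system
        namespace: flux-system
```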

It is suggested to perform a one-time installation of FluxCD in your newly created cluster and provide it with the configuration. This will install everything necessary, bringing the cluster to the expected state. For example, after installing Cozystack, you receive the following set of pre-installed Helm charts:

```
NAMESPACE                        NAME                        AGE    READY   STATUS
...
cozy-victoria-metrics-operator   victoria-metrics-operator   4m1s   True    Release reconciliation succeeded
```

As a result, you achieve a highly repeatable environment that you can provide to anyone, confident that it operates exactly as intended. This is actually what the [Cozystack](https://github.com/aenix-io/cozystack) project does, which you can try out for yourself absolutely free.

In the following articles, I will discuss how to prepare Kubernetes for running virtual machines and how to run Kubernetes clusters at the click of a button. Stay tuned, it'll be fun!
