This Terraform module installs a high-availability K3s cluster with an embedded etcd datastore in a private network on Hetzner Cloud. The following resources are provisioned by default (about 20€/month), and a minimal usage sketch follows the list:
- 3x control plane: CX11, 2 GB RAM, 1 vCPU, 20 GB NVMe, 20 TB traffic.
- 2x worker: CX21, 4 GB RAM, 2 vCPUs, 40 GB NVMe, 20 TB traffic.
- Network: private network with one subnet.
- Server and agent nodes are distributed across three datacenters (nbg1, fsn1, hel1) for high availability.
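A minimal sketch of how the module could be instantiated; the `source` is a placeholder (adjust it to wherever you consume this module from), and only the two required inputs are set, so everything else falls back to the defaults in the Inputs table below:

```hcl
# Minimal sketch, assuming the module is consumed from a local path or a
# registry (the source below is a placeholder). All inputs not set here
# use the defaults from the Inputs table.
module "my_cluster" {
  source = "<path-or-registry-address-of-this-module>"

  hcloud_token = var.hcloud_token # Hetzner Cloud API token (required)
  name         = "k3s"            # cluster name, no special chars (required)
}
```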
## Hetzner Cloud integration
- Preinstalled CSI driver for volume support.
- Preinstalled Hetzner Cloud Controller Manager for load balancer support.
## Auto-K3s-Upgrades

We provide an example of how to upgrade your K3s servers and agents with the system-upgrade-controller; see the steps under Upgrading below and check out /upgrade.
## What is K3s?

K3s is a lightweight, certified Kubernetes distribution. It's packaged as a single binary and comes with solid defaults for storage and networking, but we replaced the local-path-provisioner with the Hetzner CSI driver and the Klipper load balancer with the Hetzner Cloud Controller Manager. The default ingress controller (Traefik) has been disabled.
See a more detailed example with a walk-through in the example folder.
## Inputs

Name | Description | Type | Default | Required |
---|---|---|---|---|
agent_groups | Configuration of agent groups | `map(object({…}))` | `{…}` | no |
control_plane_server_count | Number of control plane nodes | `number` | `3` | no |
control_plane_server_type | Server type of control plane servers | `string` | `"cx11"` | no |
create_kubeconfig | Create a local kubeconfig file to connect to the cluster | `bool` | `true` | no |
hcloud_csi_driver_version | Version of the preinstalled Hetzner CSI driver | `string` | `"v1.5.3"` | no |
hcloud_token | Token to authenticate against Hetzner Cloud | `any` | n/a | yes |
k3s_version | K3s version | `string` | `"v1.21.3+k3s1"` | no |
kubeconfig_filename | Filename of the created kubeconfig file (defaults to `kubeconfig-${var.name}.yaml`) | `any` | `null` | no |
name | Cluster name (used in various places; don't use special characters) | `any` | n/a | yes |
network_cidr | Network in which the cluster will be placed | `string` | `"10.0.0.0/16"` | no |
server_additional_packages | Additional packages which will be installed on node creation | `list(string)` | `[]` | no |
server_locations | Server locations in which servers will be distributed | `list(string)` | `["nbg1", "fsn1", "hel1"]` | no |
ssh_private_key_location | Use this private SSH key instead of generating a new one (attention: encrypted keys are not supported) | `string` | `null` | no |
subnet_cidr | Subnet in which all nodes are placed | `string` | `"10.0.1.0/24"` | no |
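The object schema of `agent_groups` is truncated in the table above. Purely as an illustration, a group definition could look like the following; the attribute names `type` and `count` are assumptions, so check the module's `variables.tf` for the real schema:

```hcl
# Hypothetical illustration only: the agent_groups object type is truncated
# in the Inputs table, so these attribute names are assumptions.
agent_groups = {
  default = {
    type  = "cx21" # Hetzner server type for this group (assumed attribute)
    count = 2      # number of agents in this group (assumed attribute)
  }
}
```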
## Outputs

Name | Description |
---|---|
agents_public_ips | The public IP addresses of the agent servers |
control_planes_public_ips | The public IP addresses of the control plane servers |
k3s_token | Secret k3s authentication token |
kubeconfig | Structured kubeconfig data to supply to other providers |
kubeconfig_file | Kubeconfig file content with external IP address |
network_id | ID of the private network the cluster is placed in |
ssh_private_key | Key to SSH into nodes |
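For example, the structured `kubeconfig` output could be wired into another provider. A sketch, assuming it exposes the usual connection attributes (the attribute names below are assumptions; inspect the module's outputs to confirm them):

```hcl
# Sketch only: the attribute names on the kubeconfig output are assumptions.
provider "kubernetes" {
  host                   = module.my_cluster.kubeconfig.host
  client_certificate     = module.my_cluster.kubeconfig.client_certificate
  client_key             = module.my_cluster.kubeconfig.client_key
  cluster_ca_certificate = module.my_cluster.kubeconfig.cluster_ca_certificate
}
```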
## Cycling nodes

If you need to cycle an agent, you can do that one node at a time with the following procedure. Replace the group name and index with those of the server you want to recreate, and make sure you drain the node first; a drain example follows.
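A drain sketch using standard kubectl (the node name is taken from the demo output below):

```sh
# Evict the workload from the agent before recreating it; DaemonSet pods
# (e.g. the CSI node driver) cannot be evicted and are ignored.
KUBECONFIG=kubeconfig-k3s.yaml kubectl drain k3s-default-1-suited-hawk --ignore-daemonsets
```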
```sh
terraform taint 'module.my_cluster.module.agent_group["GROUP_NAME"].random_pet.agent_suffix[1]'
terraform apply
```
This will recreate the agent in that group on the next apply.

Control plane servers can be replaced the same way, but currently you should only replace servers that did not initialize the cluster.
```sh
terraform taint 'module.my_cluster.hcloud_server.control_plane["#1"]'
terraform apply
```
## Upgrading

- Install the system-upgrade-controller in your cluster:

  ```sh
  KUBECONFIG=kubeconfig.yaml kubectl apply -f ./upgrade/controller.yaml
  ```

- Label the nodes you want to upgrade (the command below labels all nodes; see the single-node variant after this list):

  ```sh
  KUBECONFIG=kubeconfig.yaml kubectl label --all node k3s-upgrade=true
  ```

- Run the plan for the servers:

  ```sh
  KUBECONFIG=kubeconfig.yaml kubectl apply -f ./upgrade/server-plan.yaml
  ```

  Warning: Wait for completion before you start upgrading your agents.

- Run the plan for the agents:

  ```sh
  KUBECONFIG=kubeconfig.yaml kubectl apply -f ./upgrade/agent-plan.yaml
  ```
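To upgrade only specific nodes, label them individually instead of using `--all` (standard kubectl; the node name is taken from the demo output below):

```sh
# Label a single node so only it is picked up by the upgrade plans
KUBECONFIG=kubeconfig.yaml kubectl label node k3s-control-plane-0 k3s-upgrade=true
```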
## Backups

K3s automatically backs up your embedded etcd datastore every 12 hours to `/var/lib/rancher/k3s/server/db/snapshots/`.
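To see what is there, list the snapshot directory on a server node; recent K3s releases also ship an `etcd-snapshot` subcommand for on-demand snapshots (verify that your K3s version has it):

```sh
# List the automatic snapshots on a server node
ls -lh /var/lib/rancher/k3s/server/db/snapshots/

# Take an on-demand snapshot (assumes a K3s version that ships the
# etcd-snapshot subcommand)
sudo k3s etcd-snapshot save
```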
You can reset the cluster by pointing to a specific snapshot.
- Stop the master server:

  ```sh
  sudo systemctl stop k3s
  ```

- Restore the master server with a snapshot:

  ```sh
  ./k3s server \
    --cluster-reset \
    --cluster-reset-restore-path=<PATH-TO-SNAPSHOT>
  ```
Warning: This forgets all peers and the server becomes the sole member of a new cluster. You have to manually rejoin all other servers.
- Connect to each of the other servers, back up and delete `/var/lib/rancher/k3s/server/db` on every peer etcd server, and rejoin the nodes:
```sh
sudo systemctl stop k3s
rm -rf /var/lib/rancher/k3s/server/db
sudo systemctl start k3s
```
This will rejoin the server with the master server and seed its etcd store.

Info: There is no official tool to automate this procedure. In the future, Rancher might provide an operator to handle it (issue).
## Debugging

Cloud-init logs can be found on the remote machines at:

- `/var/log/cloud-init-output.log`
- `/var/log/cloud-init.log`

To get the last K3s logs:

- `journalctl -u k3s.service -e` (server)
- `journalctl -u k3s-agent.service -e` (agent)
- Sometimes during cluster bootstrapping the cloud controller manager reports that some routes couldn't be created. This issue was fixed in master but has not been released yet. Restart the cloud-controller-manager pod and it will recreate them.
## Related projects

- terraform-hcloud-k3s: Terraform module which creates a single-node cluster.
- terraform-module-k3s: Terraform module which creates a k3s cluster, with multi-server and management features.

Icon created by Freepik from www.flaticon.com.
## Demo

```sh
$ terraform apply

var.hcloud_token
  Token to authenticate against Hetzner Cloud

  Enter a value: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

var.name
  Cluster name (used in various places, don't use special chars)

  Enter a value: k3s

...

  Enter a value: yes

...

Apply complete! Resources: 17 added, 0 changed, 0 destroyed.

Outputs:

agents_public_ips = [
  "49.12.225.212",
  "23.88.123.16",
]
control_planes_public_ips = [
  "23.88.107.54",
  "49.12.5.242",
  "65.108.52.118",
]
k3s_token = <sensitive>
kubeconfig = <sensitive>
kubeconfig_file = <sensitive>
network_id = "1260727"
ssh_private_key = <sensitive>
```
Check k8s:

```sh
$ export KUBECONFIG=./kubeconfig-k3s.yaml
$ kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k3s-control-plane-0 Ready control-plane,etcd,master 3m v1.21.3+k3s1 10.0.1.1 23.88.107.54 Ubuntu 20.04.3 LTS 5.4.0-89-generic containerd://1.4.8-k3s1
k3s-control-plane-1 Ready control-plane,etcd,master 71s v1.21.3+k3s1 10.0.1.2 49.12.5.242 Ubuntu 20.04.3 LTS 5.4.0-89-generic containerd://1.4.8-k3s1
k3s-control-plane-2 Ready control-plane,etcd,master 88s v1.21.3+k3s1 10.0.1.3 65.108.52.118 Ubuntu 20.04.3 LTS 5.4.0-89-generic containerd://1.4.8-k3s1
k3s-default-0-expert-marten Ready <none> 101s v1.21.3+k3s1 10.0.1.33 49.12.225.212 Ubuntu 20.04.3 LTS 5.4.0-89-generic containerd://1.4.8-k3s1
k3s-default-1-suited-hawk Ready <none> 109s v1.21.3+k3s1 10.0.1.34 23.88.123.16 Ubuntu 20.04.3 LTS 5.4.0-89-generic containerd://1.4.8-k3s1

$ kubectl get all --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-7448499f4d-w9ztt 1/1 Running 0 2m53s
kube-system pod/hcloud-cloud-controller-manager-74b74b9b46-hk4wq 1/1 Running 0 2m48s
kube-system pod/hcloud-csi-controller-0 5/5 Running 0 2m47s
kube-system pod/hcloud-csi-node-b6fdc 3/3 Running 0 2m46s
kube-system pod/hcloud-csi-node-mlhnx 3/3 Running 0 92s
kube-system pod/hcloud-csi-node-qzp2t 3/3 Running 0 2m9s
kube-system pod/hcloud-csi-node-sgwrw 3/3 Running 0 109s
kube-system pod/hcloud-csi-node-w9556 3/3 Running 0 2m2s
kube-system pod/metrics-server-86cbb8457f-sw4rm 1/1 Running 0 2m53s

NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 3m13s
kube-system service/hcloud-csi-controller-metrics ClusterIP 10.43.239.98 <none> 9189/TCP 2m47s
kube-system service/hcloud-csi-node-metrics ClusterIP 10.43.251.254 <none> 9189/TCP 2m46s
kube-system service/kube-dns ClusterIP 10.43.0.10 <none> 53/UDP,53/TCP,9153/TCP 3m10s
kube-system service/metrics-server ClusterIP 10.43.228.154 <none> 443/TCP 3m8s

NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/hcloud-csi-node 5 5 5 5 5 <none> 2m47s

NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system deployment.apps/coredns 1/1 1 1 3m11s
kube-system deployment.apps/hcloud-cloud-controller-manager 1/1 1 1 2m49s
kube-system deployment.apps/metrics-server 1/1 1 1 3m10s

NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system replicaset.apps/coredns-7448499f4d 1 1 1 2m55s
kube-system replicaset.apps/hcloud-cloud-controller-manager-74b74b9b46 1 1 1 2m49s
kube-system replicaset.apps/metrics-server-86cbb8457f 1 1 1 2m55s

NAMESPACE NAME READY AGE
kube-system statefulset.apps/hcloud-csi-controller 1/1 2m48s
```
Check LB & PVC:

```sh
$ kubectl apply -f manifests/hello-kubernetes-default.yaml
deployment.apps/hello-kubernetes created
service/hello-kubernetes created
persistentvolumeclaim/csi-pvc created

$ kubectl get all
NAME READY STATUS RESTARTS AGE
pod/hello-kubernetes-6f8d7694bc-xstlz 1/1 Running 0 2m38s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/hello-kubernetes LoadBalancer 10.43.231.169 10.0.1.4,162.55.152.168,2a01:4f8:1c1d:178::1 8080:30797/TCP 2m38s
service/kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 9m34s

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/hello-kubernetes 1/1 1 1 2m38s

NAME DESIRED CURRENT READY AGE
replicaset.apps/hello-kubernetes-6f8d7694bc 1 1 1 2m38s
```
Clean up with `terraform destroy`, and make sure the load balancer and volume that were created from within Kubernetes get deleted during the destroy:

```sh
$ terraform destroy

var.hcloud_token
  Token to authenticate against Hetzner Cloud

  Enter a value: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

var.name
  Cluster name (used in various places, don't use special chars)

  Enter a value: k3s

...

  Enter a value: yes

Destroy complete! Resources: 17 destroyed.
```