This project uses Cilium and Talos to provision a Kubernetes cluster mesh running on Proxmox SDN.
A Cluster Mesh extends Kubernetes to allow application deployment and administration across multiple clusters.
Refer to the Cluster Mesh Cookbook for detailed instructions on using Aggrik8s-net/aggrik8s-cluster.
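Cilium's Cluster Mesh documentation asks that every member cluster have a unique name, a unique numeric cluster ID, and Pod CIDRs that do not overlap with those of the other clusters. A minimal pre-flight check for a planned mesh might look like the sketch below. The `MeshCluster` type, the cluster names, and the CIDR values are hypothetical and not taken from this repository, and the 1-255 ID range assumes Cilium's default settings.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from itertools import combinations


@dataclass(frozen=True)
class MeshCluster:
    """Hypothetical description of one cluster that will join the mesh."""
    name: str
    cluster_id: int
    pod_cidr: str


def mesh_conflicts(clusters):
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []
    # Each cluster ID must fall inside the range assumed here (Cilium default).
    for c in clusters:
        if not 1 <= c.cluster_id <= 255:
            problems.append(f"{c.name}: cluster ID {c.cluster_id} is outside 1-255")
    # Compare every pair of clusters for clashing names, IDs, or Pod CIDRs.
    for a, b in combinations(clusters, 2):
        if a.name == b.name:
            problems.append(f"duplicate cluster name {a.name}")
        if a.cluster_id == b.cluster_id:
            problems.append(f"{a.name} and {b.name} share cluster ID {a.cluster_id}")
        if ip_network(a.pod_cidr).overlaps(ip_network(b.pod_cidr)):
            problems.append(f"{a.name} and {b.name} have overlapping Pod CIDRs")
    return problems


if __name__ == "__main__":
    plan = [
        MeshCluster("east", 1, "10.1.0.0/16"),  # illustrative values only
        MeshCluster("west", 2, "10.2.0.0/16"),
    ]
    print(mesh_conflicts(plan) or "no conflicts found")
```

Running a check like this before provisioning is cheaper than discovering a clash after both clusters are up, since changing a cluster ID or Pod CIDR later usually means rebuilding the cluster.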
Our IaC stack provisions immutable Kubernetes cluster meshes to enable digital twin infrastructure.
Benefits of a turn-key infrastructure platform:
- infrastructure repeatability, reliability, and transparency,
- Blue Green Deployments,
- Disaster Recovery,
- Follow the Sun Operations Centers.
Our IaC platform provisions turn-key cluster meshes, which is a game changer:
- we can create digital twins, turn-key infrastructure that meets Development, Staging, and Production requirements.
This repository contains an IaC stack to spin up turn-key Kubernetes cluster meshes. Cilium uses eBPF to implement the Kubernetes CNI and add Observability tools such as Hubble and Tetragon.
Talos is an immutable Linux distribution purpose-built to run Kubernetes. It is configured using a single YAML file and there is no SSH access.
Cilium is an eBPF based Kubernetes CNI which improves scalability, cost efficiency, and observability of the cluster.
We use a combination of Terraform and Ansible to provision and administer our platform.
The bbtechsys/terraform-proxmox-talos Terraform module spins up Talos clusters using bpg/terraform-provider-proxmox to provision Talos VMs and siderolabs/terraform-provider-talos to configure those VMs as our Kubernetes cluster's control-plane and worker nodes.
The stack uses DopplerHQ/terraform-provider-doppler to create and inject secrets used by hashicorp/terraform-provider-kubernetes to install k8s bits (such as Cilium CRD manifests) and by hashicorp/terraform-provider-helm for Helm chart support. We use both Terraform and Ansible to provision resources such as rook-ceph and robusta.
- Document ARMO before trial ends (3 days ?),
- Terraform Mikrotik Fabrik to support multiple AZ model, this is required for Cilium Cluster Mesh development,
- Add piCluster, our RaspberryPi 5 based rancherfederal/rk2-ansible cluster, to the Cilium Cluster Mesh,
- Consider talm to manage CozyStack PaaS-Full clusters to leverage Day-2 support,
- Document Cilium Debug Tooling:
  - Deploy the Starwars application using CI/CD,
  - Use Hubble UI to explore Starwars app,
  - Use Tetragon to explore Starwars app,
- Refactor the network layer:
  - Use VLAN to provision multiple NICs on our Talos nodes,
  - Provide VXLAN overlay using Mikrotik,
  - Provide BGP using Mikrotik and Cilium.
The Terraform stack works but requires bash helper scripts to orchestrate the multiple terraform apply --target <foo> commands needed to work around dependency-tracking gaps in the Terraform plan phase. There are also helper scripts for setting up Talos & Kubernetes credentials as well as installing Cilium.
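The orchestration performed by the helper scripts can be sketched as follows. This is a minimal Python illustration, not the repository's actual bash scripts. The stage order and the resource addresses (`module.talos`, `helm_release.cilium`) are hypothetical placeholders; only `-target` and `-auto-approve` are real `terraform apply` options.

```python
import subprocess

# Hypothetical stage order; the real resource addresses live in the Terraform stack.
STAGES = [
    ["module.talos"],         # VMs and the Talos cluster first
    ["helm_release.cilium"],  # CNI once the API server is reachable
    [],                       # final untargeted apply converges everything else
]


def stage_command(targets, auto_approve=True):
    """Build the argv for one `terraform apply`, with one -target flag per address."""
    cmd = ["terraform", "apply"]
    if auto_approve:
        cmd.append("-auto-approve")
    cmd += [f"-target={t}" for t in targets]
    return cmd


def apply_in_stages(stages=STAGES, dry_run=True):
    """Run the stages in order, stopping at the first failure; return the commands."""
    commands = [stage_command(targets) for targets in stages]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))  # show what would run without touching any state
        else:
            subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return commands


if __name__ == "__main__":
    apply_in_stages(dry_run=True)
```

Ending with an untargeted apply is a deliberate choice in this sketch: targeted applies are meant for exceptional cases, so a final full apply confirms that the state has converged.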
The stack will be fully automated once the best integration strategy is determined. For instance, Helm can be used to install Cilium, or we can use the Cilium CLI, which properly handles complicated scenarios that Helm does not.
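The two candidate strategies map onto two different command lines. The sketch below only builds those command lines so they can be compared or logged; it installs nothing. The function name, the version string, the `kube-system` namespace, and the `cilium/cilium` chart reference (which assumes the Cilium Helm repository has already been added under the name `cilium`) are assumptions for illustration, not settings taken from this stack.

```python
def cilium_install_command(strategy, version, kube_context=None):
    """Return the argv for installing Cilium with either Helm or the Cilium CLI."""
    if strategy == "helm":
        # Assumes `helm repo add cilium https://helm.cilium.io/` was run beforehand.
        cmd = ["helm", "upgrade", "--install", "cilium", "cilium/cilium",
               "--namespace", "kube-system", "--version", version]
        if kube_context:
            cmd += ["--kube-context", kube_context]
    elif strategy == "cli":
        cmd = ["cilium", "install", "--version", version]
        if kube_context:
            cmd += ["--context", kube_context]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return cmd


if __name__ == "__main__":
    # "1.16.0" is only an example version string.
    for strategy in ("helm", "cli"):
        print(" ".join(cilium_install_command(strategy, "1.16.0")))
```

Keeping the choice behind one function makes it easy to switch strategies per cluster while the integration decision is still open.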
We have verified the reusability of existing Ansible Playbooks to install Day 2 Services such as Robusta, Ollama and Honeycomb OTEL.
We use Terraform to provision a Cilium Cluster Mesh of Talos-based Kubernetes clusters.
- We spin up Kubernetes clusters using bbtechsys/talos/proxmox, which uses:
  - Proxmox VMs provisioned using bpg/terraform-provider-proxmox,
  - Talos nodes and clusters managed using the siderolabs/talos Terraform provider.
- Talos Image Factory generation of control-plane and worker-node configurations is patched to handle our requirements.
- DopplerHQ/doppler manages Secrets for Terraform and Kubernetes,
- CSI ObjectStore, BlockStore, and FileSystem services using rook-ceph on Talos and digitalocean/csi-digitalocean,
- CNI wired up following Cilium on Talos, which provides Cilium features,
- Ansible Playbooks for Day-2 services such as Honeycomb OTEL, Robusta, and OLMv1 will be merged into the stack over time.
- Hubble for network traffic analysis (see IP ...),
- Tetragon for SecOps (see system calls)
- Robusta for Cloud based Cluster DevOps workflows.
- Groundcover for Inversion of Cost for OTEL Cloud storage. They only ingest metadata; all actual OTEL data remains in the cluster.
- Ollama