KubeSurvival allows you to significantly reduce your Kubernetes compute costs by finding the cheapest machine types that can run your workloads successfully.
If you run a multi-tenant environment, ML training jobs, a large number of ML model servers, etc., this tool can help you optimize your K8s compute costs.
To easily define workloads, KubeSurvival uses a very simple DSL:
    (
        # Some microservice
        pod(cpu: 1, memory: "1Gi") +

        # Another microservice - with 3 replicas
        pod(cpu: "500m", memory: "2Gi") * 3 +

        # More microservices!
        (
            pod(cpu: 1, memory: "1Gi") +
            pod(cpu: "250m", memory: "1Gi")
        ) * 3
    ) * 2 # Production, Staging
This will give you a result such as:
    Instance type: t3.medium
    Node count: 11
    Total Price per Month: USD $340.45
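To get a feel for what the DSL expression above adds up to, here is a minimal Go sketch that mirrors its `+` and `*` operators and totals the requested resources. The `Resources` type and the `Pod`, `Add`, `Times` and `readmeWorkload` names are hypothetical and are not part of KubeSurvival. It assumes the usual Kubernetes units, where "500m" is 500 millicores and "1Gi" is 1024 MiB. It only sums requests; it says nothing about how pods are placed on nodes.

```go
package main

import "fmt"

// Resources is a hypothetical aggregate of pod requests:
// CPU in millicores, memory in MiB.
type Resources struct {
	Pods     int
	MilliCPU int
	MemMiB   int
}

// Pod returns the requests of a single pod.
func Pod(milliCPU, memMiB int) Resources {
	return Resources{Pods: 1, MilliCPU: milliCPU, MemMiB: memMiB}
}

// Add mirrors the DSL's "+" operator.
func (r Resources) Add(o Resources) Resources {
	return Resources{
		Pods:     r.Pods + o.Pods,
		MilliCPU: r.MilliCPU + o.MilliCPU,
		MemMiB:   r.MemMiB + o.MemMiB,
	}
}

// Times mirrors the DSL's "*" operator (replicas or environments).
func (r Resources) Times(n int) Resources {
	return Resources{Pods: r.Pods * n, MilliCPU: r.MilliCPU * n, MemMiB: r.MemMiB * n}
}

// readmeWorkload rebuilds the example expression from this README.
func readmeWorkload() Resources {
	return Pod(1000, 1024).
		Add(Pod(500, 2048).Times(3)).
		Add(Pod(1000, 1024).Add(Pod(250, 1024)).Times(3)).
		Times(2) // Production, Staging
}

func main() {
	w := readmeWorkload()
	fmt.Printf("pods=%d cpu=%.1f cores memory=%d GiB\n",
		w.Pods, float64(w.MilliCPU)/1000, w.MemMiB/1024)
	// prints: pods=20 cpu=12.5 cores memory=26 GiB
}
```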
Download a precompiled binary for your operating system from the Releases page.
Alternatively, if you have Go installed, you can run:
$ go install github.com/aporia-ai/kubesurvival/v2@latest
To run KubeSurvival:
See the examples directory for example config files.
How does it work?
KubeSurvival uses k8s-cluster-simulator to simulate Kubernetes pod scheduling, without running on the actual underlying machines. It iterates over all possible instance types and node counts, simulates a K8s cluster with your workload, and checks if there are any pending pods.
For each simulation, it calculates the on-demand cost per month using the ec2-instances-info library. It also reads the eni-max-pods.txt file to determine the maximum number of pods each instance type can run.
When simulating a cluster, KubeSurvival always keeps 10% of the CPU and memory free on each node.
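As an illustration of the pod-limit lookup, the sketch below parses text in the layout used by eni-max-pods.txt, assuming one `instance-type max-pods` pair per line with `#` comment lines. The `parseMaxPods` function is hypothetical and is not KubeSurvival's own code, and the sample entries are included for illustration only.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMaxPods reads "instance-type max-pods" lines and returns a lookup map.
// Blank lines and lines starting with "#" are skipped.
func parseMaxPods(text string) (map[string]int, error) {
	limits := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			return nil, fmt.Errorf("unexpected line %q", line)
		}
		n, err := strconv.Atoi(fields[1])
		if err != nil {
			return nil, fmt.Errorf("bad pod count in %q: %w", line, err)
		}
		limits[fields[0]] = n
	}
	return limits, sc.Err()
}

func main() {
	// Sample input in the assumed format.
	sample := "# instance-type max-pods\nt3.medium 17\nm5.large 29\n"
	limits, err := parseMaxPods(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("t3.medium: %d pods max\n", limits["t3.medium"])
}
```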
Finally, KubeSurvival selects the cheapest configuration without pending pods.
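The search described above can be sketched roughly as follows. This is not KubeSurvival's implementation: a simple first-fit placement stands in for k8s-cluster-simulator, and the instance names, sizes and prices are made up for the example. All types and functions here (`Pod`, `InstanceType`, `Plan`, `fits`, `cheapest`) are hypothetical.

```go
package main

import "fmt"

// Pod holds one pod's requests (millicores, MiB).
type Pod struct{ MilliCPU, MemMiB int }

// InstanceType describes a candidate machine. Values are illustrative.
type InstanceType struct {
	Name         string
	MilliCPU     int
	MemMiB       int
	MaxPods      int
	MonthlyPrice float64
}

// Plan is a cluster configuration and its monthly cost.
type Plan struct {
	Instance string
	Nodes    int
	Cost     float64
}

// fits reports whether first-fit placement leaves no pending pods,
// keeping 10% of CPU and memory free on every node.
func fits(pods []Pod, it InstanceType, nodes int) bool {
	cpuCap := it.MilliCPU * 9 / 10
	memCap := it.MemMiB * 9 / 10
	type node struct{ cpu, mem, pods int }
	cluster := make([]node, nodes)
	for _, p := range pods {
		placed := false
		for i := range cluster {
			n := &cluster[i]
			if n.pods < it.MaxPods && n.cpu+p.MilliCPU <= cpuCap && n.mem+p.MemMiB <= memCap {
				n.cpu += p.MilliCPU
				n.mem += p.MemMiB
				n.pods++
				placed = true
				break
			}
		}
		if !placed {
			return false // this pod would stay pending
		}
	}
	return true
}

// cheapest tries every instance type with 1..maxNodes nodes and
// returns the lowest-cost plan that schedules all pods.
func cheapest(pods []Pod, types []InstanceType, maxNodes int) (Plan, bool) {
	var best Plan
	found := false
	for _, it := range types {
		for n := 1; n <= maxNodes; n++ {
			if !fits(pods, it, n) {
				continue
			}
			cost := float64(n) * it.MonthlyPrice
			if !found || cost < best.Cost {
				best = Plan{Instance: it.Name, Nodes: n, Cost: cost}
				found = true
			}
			break // more nodes of the same type only cost more
		}
	}
	return best, found
}

func main() {
	pods := []Pod{{1000, 1024}, {1000, 1024}, {1000, 1024}, {1000, 1024}}
	types := []InstanceType{
		{Name: "small", MilliCPU: 2000, MemMiB: 4096, MaxPods: 17, MonthlyPrice: 30},
		{Name: "large", MilliCPU: 4000, MemMiB: 16384, MaxPods: 58, MonthlyPrice: 140},
	}
	if plan, ok := cheapest(pods, types, 50); ok {
		fmt.Printf("%s x%d = $%.2f/month\n", plan.Instance, plan.Nodes, plan.Cost)
	}
}
```

A real scheduler considers far more than first-fit does (affinities, taints, scoring), which is why the actual tool relies on a simulator rather than a packing heuristic like this one.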
What's missing from this?
Well... a lot actually. Here's a partial list:
- Support for AKS and GKE
- Support for calculating EBS storage costs
- Support for different node groups (e.g., 2 machines with GPU + 4 machines without GPU)
- and probably much more!
We would love your help!