💰 KubeSurvival

KubeSurvival allows you to significantly reduce your Kubernetes compute costs by finding the cheapest machine types that can run your workloads successfully.

If you run a multi-tenant environment, ML training jobs, or a large number of ML model servers, this tool can help you optimize your K8s compute costs.

To easily define workloads, KubeSurvival uses a very simple DSL:

  (
    # Some microservice
    pod(cpu: 1, memory: "1Gi") + 

    # Another microservice - with 3 replicas
    pod(cpu: "500m", memory: "2Gi") * 3 +

    # More microservices!
    (
      pod(cpu: 1, memory: "1Gi") +
      pod(cpu: "250m", memory: "1Gi")
    ) * 3
  ) * 2  # Production, Staging

This will give you a result such as:

Instance type: t3.medium
Node count: 11
Total Price per Month: USD $340.45
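
To make the DSL arithmetic concrete, here is a minimal Go sketch that expands the example expression into a flat list of pods and sums their requests. It assumes that `+` places workloads side by side and `*` replicates a group, as the comments in the example suggest. The `Pod`, `Add`, `Times`, `Workload` and `Totals` names are hypothetical and are not KubeSurvival's actual parser or API.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for one pod(cpu: ..., memory: ...) term.
// CPU is in millicores and memory in MiB so the arithmetic stays in integers.
type Pod struct {
	MilliCPU  int
	MemoryMiB int
}

// Add models the assumed meaning of "+": the pods of all operands together.
func Add(groups ...[]Pod) []Pod {
	var out []Pod
	for _, g := range groups {
		out = append(out, g...)
	}
	return out
}

// Times models the assumed meaning of "*": n replicas of a group.
func Times(group []Pod, n int) []Pod {
	var out []Pod
	for i := 0; i < n; i++ {
		out = append(out, group...)
	}
	return out
}

// Workload rebuilds the example expression from the text above.
func Workload() []Pod {
	one := func(milliCPU, memMiB int) []Pod { return []Pod{{milliCPU, memMiB}} }
	return Times(Add(
		one(1000, 1024),          // pod(cpu: 1, memory: "1Gi")
		Times(one(500, 2048), 3), // pod(cpu: "500m", memory: "2Gi") * 3
		Times(Add(one(1000, 1024), one(250, 1024)), 3),
	), 2) // Production, Staging
}

// Totals sums the CPU and memory requests of all pods.
func Totals(pods []Pod) (milliCPU, memMiB int) {
	for _, p := range pods {
		milliCPU += p.MilliCPU
		memMiB += p.MemoryMiB
	}
	return milliCPU, memMiB
}

func main() {
	pods := Workload()
	cpu, mem := Totals(pods)
	// Prints: pods=20 cpu=12500m memory=26624Mi
	fmt.Printf("pods=%d cpu=%dm memory=%dMi\n", len(pods), cpu, mem)
}
```

Under these assumptions the example describes 20 pods requesting 12.5 CPU cores and 26 GiB of memory in total, which is the workload the simulator has to place.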

Installation

Download a precompiled binary for your operating system from the Releases page.

Alternatively, if you have Go installed, you can run:

$ go install github.com/aporia-ai/kubesurvival/v2@latest

Usage

To run KubeSurvival:

./kubesurvival config.yaml

See the examples directory for example config files.

How does it work?

KubeSurvival uses k8s-cluster-simulator to simulate Kubernetes pod scheduling without running anything on real machines. It iterates over all possible instance types and node counts, simulates a K8s cluster with your workload, and checks whether any pods are left pending.

For each simulation, it calculates the on-demand cost per month using the ec2-instances-info library. Additionally, it reads the eni-max-pods.txt file to determine the maximum number of pods that each instance type supports.
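
As an illustration of the pod-limit lookup, the following Go sketch parses text in the layout used by eni-max-pods.txt: one `<instance-type> <max-pods>` pair per line, with `#` comment lines. The function name `ParseMaxPods` and the inline sample are assumptions made for this example. KubeSurvival's own loading code may differ, and real values should be taken from the actual file.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// ParseMaxPods reads "<instance-type> <max-pods>" lines, skipping blank
// lines and "#" comments, and returns the limits keyed by instance type.
func ParseMaxPods(text string) (map[string]int, error) {
	limits := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			return nil, fmt.Errorf("unexpected line %q", line)
		}
		n, err := strconv.Atoi(fields[1])
		if err != nil {
			return nil, fmt.Errorf("bad pod count in %q: %w", line, err)
		}
		limits[fields[0]] = n
	}
	return limits, sc.Err()
}

func main() {
	// Illustrative sample only; consult the real file for current values.
	sample := "# instance-type max-pods\nt3.medium 17\n"
	limits, err := ParseMaxPods(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(limits["t3.medium"])
}
```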

When simulating a cluster, KubeSurvival always keeps 10% of the CPU and memory free on each node.

Finally, KubeSurvival selects the cheapest configuration without pending pods.
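
The search loop described above can be sketched roughly as follows. This is not KubeSurvival's implementation: a simple first-fit packer stands in for k8s-cluster-simulator, and the instance names, capacities and prices in `main` are made-up placeholders rather than real EC2 data. The sketch only shows the shape of the idea: reserve 10% of each node, try increasing node counts per instance type, and keep the cheapest configuration with no pending pods.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod holds resource requests in millicores and MiB.
type Pod struct{ MilliCPU, MemoryMiB int }

// InstanceType is a hypothetical description of one machine type.
type InstanceType struct {
	Name          string
	MilliCPU      int
	MemoryMiB     int
	MaxPods       int
	PricePerMonth float64
}

// Plan is one candidate cluster configuration.
type Plan struct {
	Instance string
	Nodes    int
	Cost     float64
}

// fits reports whether every pod can be placed on `nodes` machines of type
// it, keeping 10% of CPU and memory free on each node. First-fit packing is
// a simplification of real scheduler behaviour.
func fits(pods []Pod, it InstanceType, nodes int) bool {
	type node struct{ cpu, mem, slots int }
	free := make([]node, nodes)
	for i := range free {
		free[i] = node{it.MilliCPU * 9 / 10, it.MemoryMiB * 9 / 10, it.MaxPods}
	}
	// Place the largest pods first so small ones fill the gaps.
	sorted := append([]Pod(nil), pods...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].MemoryMiB > sorted[j].MemoryMiB })
	for _, p := range sorted {
		placed := false
		for i := range free {
			if free[i].cpu >= p.MilliCPU && free[i].mem >= p.MemoryMiB && free[i].slots > 0 {
				free[i].cpu -= p.MilliCPU
				free[i].mem -= p.MemoryMiB
				free[i].slots--
				placed = true
				break
			}
		}
		if !placed {
			return false // this pod would stay pending
		}
	}
	return true
}

// Cheapest returns the lowest-cost plan that leaves no pod pending.
func Cheapest(pods []Pod, types []InstanceType, maxNodes int) (Plan, bool) {
	var best Plan
	found := false
	for _, it := range types {
		for n := 1; n <= maxNodes; n++ {
			if !fits(pods, it, n) {
				continue
			}
			cost := float64(n) * it.PricePerMonth
			if !found || cost < best.Cost {
				best, found = Plan{it.Name, n, cost}, true
			}
			break // more nodes of the same type would only cost more
		}
	}
	return best, found
}

func main() {
	pods := []Pod{{1000, 1024}, {1000, 1024}, {1000, 1024}, {1000, 1024}}
	types := []InstanceType{ // placeholder names and prices, not real EC2 data
		{"small", 2000, 4096, 10, 30},
		{"large", 8000, 16384, 10, 100},
	}
	plan, ok := Cheapest(pods, types, 20)
	fmt.Println(plan, ok)
}
```

With these placeholder numbers, the 10% reserve leaves each "small" node with 1800m of CPU, so it holds only one 1000m pod and four nodes are needed. A single "large" node therefore turns out cheaper, which illustrates why the reserve can change the result.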

What's missing from this?

Well... a lot actually. Here's a partial list:

  • Support for AKS and GKE
  • Support for calculating the cost of EBS storage
  • Support for different node groups (e.g., 2 machines with GPU + 4 machines without GPU)
  • and probably much more!

We would love your help! ❤️
