CloudKnit: An Open Source Solution for Managing Cloud Environments
CloudKnit is an open-source
progressive delivery platform for managing cloud environments. It enables organizations to Define entire environments in a declarative way, Provision them, Detect and Reconcile Drift, and Teardown environments when no longer needed. It also comes with dashboards to help visualize environments and observe them.
CloudKnit is based on a concept called Environment as Code. Some people have started calling it Declarative Pipelines.
Note: We are not a big fan of using Pipeline and Declarative together as Pipeline to us means a sequence of steps which conflicts with what Declarative means.
Environment as Code (EaC) is an abstraction over Cloud-native tools that provides a declarative way of defining an entire Environment. It has a Control Plane that manages the state of the environment, including resource dependencies, and drift detection and reconciliation.
Why we built CloudKnit
There are tools today that allow us to manage cloud environments but as the environments become more complex and teams look for advanced use cases, existing tools can fall short. This causes some teams to build and maintain in-house solutions.
We want to make it easy for DevOps and Platform Engineering teams to manage complex environments. Existing cloud-native tools like Terraform, Pulumi, Helm, ArgoCD, etc. on their own can be great at managing individual components within an environment or automating simple environments. However an environment like the one below (Diagram 2) with Infrastructure resources and cloud-native applications requires a different solution. Currently you have two main approaches:
Option 1: Monolith Infrastructure as Code & Application Deployment
Monolith IaC & Application deployments work well when environments are simple, but as things get complex, it becomes a nightmare to maintain. It creates tight coupling and causes issues like slow provisioning.
Option 2: Use loosely coupled IaC & Application Deployment & hand-roll pipelines to run them
Creating loosely-coupled components like networking, eks, rds, backend-apps, frontend-apps, etc. makes the individual components easier to manage, but the pipelines that run those components become complex. Pipeline code needs to manage the logic to run the various components in the correct order, handle failures and tear down unused resources.
Pipeline code is imperative, and users have to write logic on “HOW” to get an entire environment. This causes a maintenance nightmare. We have seen teams write hundreds of lines of unmaintainable pipeline code.
There are other challenges that teams face as their environments become more complex.
- Environment Replication is a pain
- Not easy to Visualize/Understand Environments
- Drift Detection for the entire environment is difficult
- Not straightforward to Promote changes across environments
How does CloudKnit work?
Environment management with CloudKnit is divided into 4 stages:
This stage allows you to define an entire environment. We currently support easy to use YAML format for the environment definition.
See example below:
apiVersion: stable.cloudknit.com/v1 kind: Environment metadata: name: zmart-payment-prod-blue namespace: zmart-config spec: teamName: payment envName: prod-blue components: - name: networking type: terraform autoApprove: true module: source: firstname.lastname@example.org:terraform-aws-modules/terraform-aws-vpc.git variablesFile: path: "prod-blue/vars/networking.tfvars" outputs: - name: vpc_id - name: platform-eks type: terraform dependsOn: [networking] module: source: email@example.com:terraform-aws-modules/terraform-aws-eks.git variables: - name: vpc_id valueFrom: networking.vpc_id variablesFile: path: "prod-blue/vars/platform-eks.tfvars" - name: website type: helm #Native support for helm chart coming soon dependsOn: [platform-eks] source: repo: firstname.lastname@example.org:helm/examples.git path: charts/hello-world variables: - name: environment value: prod-blue
CloudKnit Control Plane running in Kubernetes uses the Environment definition & runs various Components (Terraform, Helm Charts, etc.) in the right order. It also provides Visibility & Workflow while Provisioning the environment.
3. Detect Drift + Reconciliation
Like Kubernetes does drift detection for k8s apps & reconciles them to match the desired state in source control, CloudKnit does drift detection for the entire environment (infra + apps) & reconciles them.
Note: In case of Infrastructure as Code CloudKnit provides an ability to see the plan & get manual approval before running the IaC to make sure it doesn't destroy any resources you don't want to, especially in Production environments.
You might want to teardown environments when they are not used to save costs. CloudKnit provides a single line change using flag
teardown in the Environment YAML. Once
teardown flag is set to true and definition is pushed to Source Control, CloudKnit picks up the change and tears the environment down by destroying individual components in the correct order.
Environment Visibility & Workflow
CloudKnit also provides visibility into your environments and an optimal GitOps workflow with useful information on the UI like estimated costs/status etc. Check diagram 4 below for an example environment in CloudKnit UI.
We hope that by open-sourcing CloudKnit early, we can form a close-knit open-source community around it to make managing cloud environments easy.
Components: A logical grouping of 1 or more Infrastructure Resources or Applications that get provisioned together. For example, Networking is an Infrastructure Component with various Infrastructure resources like Virtual Private Cloud(VPC), Subnets, Internet Gateways, Route Tables, etc.
Environment: A logical grouping of all the Components needed to run business applications. The grouping includes components like networking, eks, database, k8s apps, etc.