cndi
Start with a Template for a popular service and CNDI will help you deploy it on your own infrastructure, just as easily as you can sign up for a Platform as a Service.
Once your cluster is set up, manage the infrastructure and applications with ease using GitOps and Infrastructure as Code.
demo video 🎥
If you'd like to see a walkthrough for setting up an Airflow cluster using CNDI, check out this demo:

installation 📥
To install CNDI we just need to download the binary and add it to our PATH. This can be done using the script below:
curl -fsSL https://raw.githubusercontent.com/polyseam/cndi/main/install.sh | sh
If you're on Windows we have a one-liner for that too, described in this short guide.
In either case, once the script has finished running, the cndi binary and its dependencies are installed to ~/.cndi/bin.
usage
CNDI is a tool for deploying GitOps-enabled Kubernetes application clusters on any platform as quickly and easily as possible. The best way to understand this process is to look at it as a lifecycle.
lifecycle: init 🌱
The first step in the lifecycle is to initialize the CNDI project. Because CNDI's mechanics are based on the GitOps workflow, we should initialize a Git repository before we do anything else. The best way to do this as a GitHub user is to use the gh CLI.
gh repo create cndi-example --private --clone && cd cndi-example
Now that we have a Git repository, we can initialize a new CNDI project.
We can do this in two ways: either by using the interactive CLI, or by writing a "cndi-config" file yourself (or forking one you got from someone else), named cndi-config.yaml or cndi-config.jsonc.
interactive mode
The best way to get started if you are new to CNDI is to use the interactive CLI, so let's look at that first.
# once cndi is in your "PATH" you can run it from anywhere
cndi init --interactive
This will start an interactive CLI that asks you a series of questions, the first of which is to select a Template. Templates are a CNDI concept: think of them as a "blueprint" for a data stack. Once you select a Template, CNDI will ask you some general questions about your project and some Template-specific questions, then write out a few files inside your project repo.
non-interactive mode
The other way to initialize a CNDI project is to use a CNDI Template file. This is done by calling cndi init --template <template_location>. When specifying a Template location you can use the Templates the CNDI team has built, found in this repo at src/templates, by passing in a Template "name". For example, if you wanted to run the Airflow Template on EC2 you would run:
cndi init --template ec2/airflow
Alternatively, you can specify a Template URL. A Template URL resolves to a CNDI Template file, which means you are not limited to the Templates the CNDI team has put together: you can point to any template file that follows the Template Schema.
These Template URLs can be file:// URLs containing an absolute path to a local file, or typical remote https:// URLs over the net.
# file:// URLs must be absolute paths
cndi init --template file:///absolute/path/to/template.yaml
# or
cndi init --template https://example.com/path/to/template.yaml
Whether you've chosen to use interactive mode or not, CNDI has generated a few files and folders for us based on our cndi-config.yaml file. If you want to learn about what CNDI is really creating, this is the best file to look at. We break down all of these generated files later in this document in the outputs section.
The next step in our one-time project setup is to make sure that we have all the required environment variables for our project. Some of these values are required for every deployment: for example, you always need to have GIT_USERNAME, GIT_PASSWORD and GIT_REPO. Some are only required for certain "deployment targets", like AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, which are only needed for AWS deployments. Lastly, some are only required for certain Templates; for example, all Airflow Templates require GIT_SYNC_PASSWORD for accessing repos that hold Airflow DAGs.
These environment variables are saved to the .env file that CNDI has generated for us. If you didn't use interactive mode you may have some placeholders in that file to overwrite, and they should be easy to spot. CNDI should also tell you if it is missing expected values.
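For reference, a generated .env might look something like the sketch below; every value shown is an illustrative placeholder, not a real credential.
# required for every deployment
GIT_USERNAME=johndoe
GIT_PASSWORD=ghp_xxxxxxxxxxxxxxxxxxxx
GIT_REPO=https://github.com/johndoe/cndi-example
# only required for aws deployment targets
AWS_ACCESS_KEY_ID=AKIAxxxxxxxxxxxxxxxx
AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxx
# only required for airflow Templates
GIT_SYNC_PASSWORD=ghp_xxxxxxxxxxxxxxxxxxxx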
When all of the values have been set, we use the gh CLI again, this time to push our secret environment variables to GitHub.
# cndi requires version 2.23.0 or later of the GitHub CLI
gh secret set -f .env
Now we are ready for the next phase of the lifecycle!
lifecycle: push 🚀
Now that we have initialized our project, CNDI has given us files that describe our infrastructure resources and files that describe what to run on that infrastructure.
CNDI has also created a GitHub Action for us which is responsible for calling cndi run. The run command provided in the cndi binary is responsible for calling terraform apply to deploy our infrastructure. To trigger the process we just need to push our changes to the repository:
git add .
git commit -m "initial commit"
git push
After cndi run has exited successfully you should be able to see new resources spinning up in the deployment target you selected. When the nodes come online in that destination, they will join together to form a Kubernetes cluster.
As the nodes join the cluster automatically, they begin sharing workloads. Some workloads come bundled; we call these CNDI platform services. There are a couple of such services: one is sealed-secrets and another is ArgoCD. Sealed Secrets enables storing Kubernetes Secrets securely within Git, and ArgoCD is a GitOps tool which monitors a repo for Kubernetes manifests and applies them.
When ArgoCD comes online, it will begin reading files from the cndi/cluster_manifests directory in the GitHub repo we have been pushing to. Ultimately cndi run is only used within GitHub for infrastructure, and ArgoCD is solely responsible for what runs on the cluster.
Your cluster will be online in no time!
lifecycle: overwrite ♻️
The next phase of the lifecycle is about making changes to your cluster. These changes can be cluster_manifests oriented, if you are making changes to the software running on your infrastructure, or they can be infrastructure oriented, if you are horizontally or vertically scaling your cluster.
In either case, the approach is the same. Open your cndi-config.yaml file in your editor and make changes to your applications, cluster_manifests, or infrastructure, then run:
# shorthand for cndi overwrite
cndi ow
Upon execution of the command you should see that some of the files cndi generated for us before have been modified or supplemented with new files. So far no changes have been made to our cluster; just like before, we need to push the changes up for them to take effect. This is what GitOps is all about: we don't log in to our servers to make changes, we simply modify our config and git push!
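For instance, horizontally scaling a cluster is just a matter of appending a worker entry to the nodes array in your cndi-config.yaml (the node name here is illustrative):
infrastructure:
  cndi:
    nodes:
      # ...existing node entries unchanged...
      - name: gcp-echo # new worker to be provisioned on the next push
        kind: gce
        role: worker
After running cndi ow, commit and push as usual and the new node will be provisioned and will join the cluster.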
With these three phases you have everything you need to deploy a data infrastructure cluster using CNDI and evolve it over time!
lifecycle: destroy 🗑️
When it comes time to tear down your cluster, there is only one step, just call:
cndi destroy # in your project repo, and we will take care of the rest!
This will delete all of the infrastructure resources that CNDI created for you, and from there you can choose either to delete the repo or keep it around for later.
Walkthroughs 🥾
We've got a few walkthroughs you can follow if you'd like, one for each deployment target. The walkthroughs demonstrate how to deploy a production-grade Airflow cluster using CNDI's airflow Template.
- ec2/airflow - AWS EC2
- eks/airflow - AWS EKS
- gce/airflow - GCP Compute Engine
- gke/airflow - GCP GKE
- avm/airflow - Azure Virtual Machines
- aks/airflow - Azure AKS
- dev/airflow - Local Development
If you are interested in using CNDI, these walkthroughs are entirely transferable to applications beyond Airflow.
configuration 📝
Let's run through the three parts of a cndi-config.yaml file. This one file is the key to understanding CNDI, and it's really pretty simple.
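At a high level it is just three top-level sections. Here's a minimal skeleton; each section is unpacked below:
infrastructure: # compute for the cluster, plus any extra Terraform resources
  cndi:
    nodes: []
  terraform: {}
applications: {} # Helm charts deployed to the cluster via ArgoCD
cluster_manifests: {} # raw Kubernetes manifests applied via ArgoCD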
infrastructure.cndi 🏗️
The infrastructure section is used to define the infrastructure that will power our cluster. It is broken out into two distinct categories. The first category is cndi, and it refers to infrastructure abstractions our team has invented that CNDI exposes for you.
Currently CNDI exposes only one abstraction, the nodes interface: a wrapper that simplifies deploying Kubernetes cluster nodes. The CNDI nodes interface wraps the compute resources from every deployment target we support. All nodes entries are nearly identical; the most substantial difference is the kind field, which is used to specify the deployment target. These node resources and their supporting infrastructure are ultimately provisioned by Terraform, but we've abstracted away a lot of complexity through this interface.
Declaring a node is simple: we give it a name, we give it some specs, and we add it to the array!
# you could use JSON instead, but YAML is preferred
infrastructure:
cndi:
nodes:
- name: gcp-alpha
kind: gce
role: leader # node responsible for instantiating the cluster
machine_type: n2-standard-16
- name: gcp-beta # node runs the control plane by default
kind: gce
- name: gcp-charlie
kind: gce
- name: gcp-delta
kind: gce
role: worker # node does not run the control plane
Currently we have support for dev, AWS's ec2 and eks, azure, and gce clusters. More deployment targets are on the way!
Just like every other component in CNDI, nodes can be updated in our cndi-config.jsonc, and we can call cndi ow and push the changes to our git remote to modify the cluster accordingly.
infrastructure.terraform 🧱
The second category within infrastructure is terraform. This is where you can define any Terraform resources you want to be provisioned alongside your cluster.
infrastructure:
cndi: {}
terraform:
resource:
aws_s3_bucket:
my-bucket:
acl: public-read
bucket: s3-website-test.hashicorp.com
cors_rule:
- allowed_headers:
- "*"
allowed_methods:
- PUT
- POST
allowed_origins:
- https://s3-website-test.hashicorp.com
expose_headers:
- ETag
max_age_seconds: 3000
💡 You can also use this section to override any of the default Terraform objects that CNDI deploys.
Generally, you should be able to customize CNDI resources through the cndi section instead. But if you do need to patch a Terraform resource CNDI has created for you, you simply need to match the resource name we have used and specify the fields you want to update.
infrastructure:
cndi: {}
terraform:
resource:
aws_vpc:
cndi_aws_vpc:
cidr_block: 10.0.0.0/24
applications 💽
The next thing we need to configure is the applications that will actually run on the cluster. With CNDI v1 we focused on making it a breeze to deploy Apache Airflow in Kubernetes.
Let's see how we accomplish this in the new and improved CNDI:
infrastructure: {}
applications:
airflow:
targetRevision: 1.7.0 # version of Helm chart to use
destinationNamespace: airflow # kubernetes namespace in which to install application
repoURL: https://airflow.apache.org
chart: airflow
# where you configure your Helm chart values.yaml
values:
dags:
gitSync:
enabled: true
repo: https://github.com/polyseam/demo-dag-bag
branch: main
wait: 70
subPath: dags
# These options are required by Airflow in this context
createUserJob:
useHelmHooks: false
migrateDatabaseJob:
useHelmHooks: false
This is built on top of ArgoCD's Application CRDs and Helm Charts. If you have a Helm Chart, CNDI can deploy it!
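Any other chart follows the same shape. As a sketch, here's a hypothetical entry for deploying Grafana from its public Helm repository; the chart version and values are illustrative:
infrastructure: {}
applications:
  grafana:
    targetRevision: 6.50.0 # illustrative chart version
    destinationNamespace: grafana
    repoURL: https://grafana.github.io/helm-charts
    chart: grafana
    values:
      adminUser: admin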
cluster_manifests 📑
The third aspect of a cndi-config file is the cluster_manifests object. Any objects here will be used as Kubernetes manifests and they'll be applied to your cluster through ArgoCD. This gives you full access to all the Kubernetes systems and APIs.
infrastructure: {}
applications: {}
cluster_manifests: # inside the "cluster_manifests" object you can put all of your custom Kubernetes manifests
ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: minimal-ingress
annotations:
nginx.ingress.kubernetes.io/rewrite-target: "/"
spec: {}
If you are new to Kubernetes and are unsure what any of that meant, don't sweat it. CNDI is designed to help with that knowledge gap with templates, and you'll learn along the way too!
Pro tip! 🤫
If you want to add a new Kubernetes Secret to use inside of your Kubernetes cluster via GitOps, we make this possible by encrypting your secrets with sealed-secrets so they can live in your repo securely and be picked up by ArgoCD automatically. To add a secret to your cluster, add the value to your .env file and then add a cluster_manifest entry like the one below. After that, just call cndi ow to seal your secret.
The example below results in sealing the environment variables "GIT_USERNAME" and "GIT_PASSWORD" into the destination secret key names "GIT_SYNC_USERNAME" and "GIT_SYNC_PASSWORD" respectively.
infrastructure: {}
applications: {}
cluster_manifests:
airflow-git-credentials-secret:
apiVersion: v1
kind: Secret
metadata:
name: airflow-git-credentials
namespace: airflow
stringData:
GIT_SYNC_USERNAME: "$.cndi.secrets.seal(GIT_USERNAME)"
GIT_SYNC_PASSWORD: "$.cndi.secrets.seal(GIT_PASSWORD)"
outputs 📂
When cndi init is called there are a few files that it produces:
- a cndi-config.yaml file - autogenerated in interactive mode only, described in the configuration section above
- a .github/workflows folder, with a GitHub Action inside. The workflow mostly just wraps the cndi run command in the CNDI binary executable. As such, if you have a different CI system, you can execute the cndi run command on the binary there instead.
- a cndi/terraform folder, containing the infrastructure resources cndi has generated for Terraform, which cndi will apply automatically every time cndi run is executed.
- a cndi/cluster_manifests folder, containing Kubernetes manifests that will be installed on your new cluster when it is up and running. This includes manifests like Ingress from the cluster_manifests section of your cndi-config.jsonc.
- a cndi/cluster_manifests/applications folder, which contains a folder for each application defined in the applications section of your cndi-config.yaml, with a generated ArgoCD Application CRD inside that contains our expertly chosen defaults for that App and the specific parameters you've specified yourself in the applications section of your cndi-config.yaml.
- a .env file which contains all of the environment variables that CNDI relies on; these values must be defined and valid when cndi run is executed.
- a .gitignore file to ensure secret values never get published as source files to your repo
- a ./README.md file that explains how you can use and modify these files yourself for the lifetime of the cluster
up and running
logging into ArgoCD 🔐
ArgoCD's Web UI is a useful tool for visualizing and debugging your cluster resources. Some of our Templates set up Ingress for ArgoCD automatically; if you don't have an Ingress you can still access the UI by following our port-forwarding doc. Once you can see the login screen for ArgoCD you can log in with the username admin and the password we set for you in your .env file under the key ARGOCD_ADMIN_PASSWORD.
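If you're port-forwarding, a minimal sketch with kubectl looks like the following, assuming the conventional ArgoCD namespace and service names:
# forward local port 8080 to the ArgoCD server running in the cluster
kubectl port-forward svc/argocd-server -n argocd 8080:443
# then open https://localhost:8080 and log in as admin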
dns 🌐
Setting up DNS for your cluster is a critical step if your cluster will be served online. The solution depends on your "deployment target". We have a doc walking through DNS setup for AWS and GCP coming soon, but in short you just need to point DNS at the load balancer we provisioned for you. In AWS this means using a CNAME record; for a cluster running on GCP or Azure, it means an A record.
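As an illustrative sketch, those records might look like this in zone-file notation; the hostnames and addresses are hypothetical:
; AWS: point a CNAME at the load balancer's DNS name
airflow.example.com.  300  IN  CNAME  my-lb-0123456789.elb.us-east-1.amazonaws.com.
; GCP or Azure: point an A record at the load balancer's IP
airflow.example.com.  300  IN  A      203.0.113.10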
building cndi (contributor guide) 🛠️
If you are hoping to contribute to this project and want to learn the ropes, you are in the right place. Let's start with setting up your environment:
setup 📦
The first step, as you might expect, is to clone this repo. Take note of where you clone to; it will matter later when we set up some convenience aliases.
1. Clone Repo:
git clone https://github.com/polyseam/cndi
2. Install Deno:
Next, let's install Deno. Though it can be installed with a package manager, we recommend installing it without one. Once Deno is installed, make sure you add it to your PATH.
3. Setup cndi Alias: Let's set up an alias that allows us to use the Deno source code as if it were the regular CLI, without colliding with the released cndi binary:
# make sure the path below is correct, pointing to the main.ts file in the repo
alias cndi-next="deno run -A ~/dev/polyseam/cndi/main.ts"
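To confirm the alias resolves to your local checkout, invoke it once; it should print the CLI's usage text:
# runs main.ts from your clone rather than the released binary
cndi-next --help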
We're continuously improving CNDI, but if you have an issue, check out the frequently-asked-questions to get unblocked quickly.
If you have any other issues or questions please message Matt or Tamika in the Polyseam Discord Chat.