Ballast manages Kubernetes node pools to give you the cost of preemptible nodes with the confidence of on-demand nodes.
There are three steps to deploy the ballast-operator:

- Create a GCP service account
- Create a Kubernetes Secret with the GCP service account keys
- Deploy the operator
The ballast Deployment will need to run as a GCP service account with access to your clusters' node pools.
The following script will create a GCP service account with permissions to view and manage cluster pool sizes.
```shell
export GCP_PROJECT=my-project-id
export SERVICE_ACCOUNT=ballast-operator

gcloud iam service-accounts create ${SERVICE_ACCOUNT}

gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
  --member serviceAccount:${SERVICE_ACCOUNT}@${GCP_PROJECT}.iam.gserviceaccount.com \
  --role roles/container.admin

gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
  --member serviceAccount:${SERVICE_ACCOUNT}@${GCP_PROJECT}.iam.gserviceaccount.com \
  --role roles/compute.viewer
```
Note: ballast only needs a few permissions. Security-minded users may prefer to create a custom role with the following permissions instead:
- container.clusters.get
- container.clusters.update
- compute.instanceGroups.get
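As a sketch of that least-privilege option, the custom role could be created and bound like this (the role ID `ballastOperator` is an illustrative name, not one Ballast requires):

```shell
# Sketch: a least-privilege alternative to roles/container.admin.
# The role ID "ballastOperator" is an illustrative choice.
gcloud iam roles create ballastOperator \
  --project ${GCP_PROJECT} \
  --title "Ballast Operator" \
  --permissions container.clusters.get,container.clusters.update,compute.instanceGroups.get

gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
  --member serviceAccount:${SERVICE_ACCOUNT}@${GCP_PROJECT}.iam.gserviceaccount.com \
  --role projects/${GCP_PROJECT}/roles/ballastOperator
```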
The following script will create a secret named `ballast-operator-sa-keys` that contains the GCP service account JSON keys.

```shell
gcloud iam service-accounts keys create /tmp/ballast-keys.json \
  --iam-account ${SERVICE_ACCOUNT}@${GCP_PROJECT}.iam.gserviceaccount.com

kubectl create secret generic ballast-operator-sa-keys --from-file=gcp.json=/tmp/ballast-keys.json

rm /tmp/ballast-keys.json
```
A kustomization base is included that deploys:

- ClusterRole
- ClusterRoleBinding
- CustomResourceDefinition
- Deployment
- PodDisruptionBudget
- ServiceAccount
- Service

The kustomization file expects `secret/ballast-operator-sa-keys` (created above) to exist in the same namespace the operator is deployed in.

```shell
kubectl apply -k ./manifests/base/
```
The operator exposes Prometheus metrics on port 9323 at `/metrics`.

The operator is configured with the following environment variables:

- `BALLAST_METRICS_PORT=9323`
- `BALLAST_DEBUG=true`
- `GOOGLE_APPLICATION_CREDENTIALS=/abs/path/to/creds.json`
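If you need to override these, one place to set them is the operator Deployment's container env; a minimal sketch, assuming the container is named `ballast-operator` and the service account key secret is mounted at `/etc/gcp` (both are assumptions, not fixed by Ballast):

```yaml
# Sketch of a Deployment fragment; container name and mount path are assumptions.
spec:
  template:
    spec:
      containers:
        - name: ballast-operator
          env:
            - name: BALLAST_METRICS_PORT
              value: "9323"
            - name: BALLAST_DEBUG
              value: "true"
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /etc/gcp/gcp.json # assumed mount path of the SA key secret
```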
Ballast requires that all node pools be created in advance. Ballast only scales managed pools' minimum count (or current size, when autoscaling is disabled) to match the required minimums of the source pool.
```yaml
apiVersion: ballast.bonny.run/v1
kind: PoolPolicy
metadata:
  name: ballast-example
spec:
  projectId: gcp-project-id-here
  location: us-central1-a # zone that main/source pool of preemptible nodes exists in
  clusterName: your-cluster-name
  poolName: my-main-pool # name of the main/source pool
  cooldownSeconds: 300
  managedPools: # list of pools to scale relative to main pool
    - poolName: pool-b
      minimumInstances: 1
      minimumPercent: 25
      location: us-central1-a
    - poolName: pool-c
      minimumInstances: 5
      minimumPercent: 50
      location: us-central1-a
```
Multiple managed pools can be specified. A mix of autoscaling and fixed size pools can be used, as well as pools of different instance types/sizes.
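As a rough illustration of the sizing math, assuming the managed pool's minimum is taken as the larger of `minimumInstances` and `minimumPercent` of the source pool's current size (an assumption; check the PoolPolicy behavior for the exact rule):

```shell
# Sketch of the assumed sizing rule: take the larger of minimumInstances
# and minimumPercent of the source pool's current node count.
managed_minimum() {
  local source_size=$1 min_instances=$2 min_percent=$3
  local by_percent=$(( source_size * min_percent / 100 ))
  if [ "$by_percent" -gt "$min_instances" ]; then
    echo "$by_percent"
  else
    echo "$min_instances"
  fi
}

# With pool-c above (minimumInstances: 5, minimumPercent: 50) and a
# 20-node source pool, the percentage term wins:
managed_minimum 20 5 50 # prints 10
```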
The following steps will cause Kubernetes to prefer scheduling workloads on your preemptible nodes, but schedule workloads on your on-demand pools when it must.
- Add the label `node-group:a-good-name-for-your-node-group` to all of your node pools that will be referenced in your `PoolPolicy`.
- Add the following affinity to your `Pod`, `Deployment`, or other workload.
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-group
                operator: In
                values:
                  - a-good-name-for-your-node-group
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: cloud.google.com/gke-preemptible
                operator: In
                values:
                  - "true"
```
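For the labeling step on GKE, node labels are best applied with `--node-labels` when the pool is created, so nodes added later keep the label. A sketch reusing the example names from above:

```shell
# Sketch: creating a preemptible pool with the node-group label.
# Cluster, pool, and label names reuse the examples above.
gcloud container node-pools create my-main-pool \
  --cluster your-cluster-name \
  --zone us-central1-a \
  --preemptible \
  --node-labels node-group=a-good-name-for-your-node-group
```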
Ballast also supports a CRD called an `EvictionPolicy`. Eviction policies allow you to specify rules for evicting pods from nodes. This can be useful for evicting pods off of unpreferred nodes, effectively implementing ~`preferredDuringSchedulingPreferredDuringExecution`.
The schema is:

- `mode` (all, unpreferred) - evict off all nodes or only unpreferred nodes based on `preferredDuringSchedulingIgnoredDuringExecution`; Default: all
- `maxLifetime` - max lifetime of a pod matching `selector`; Default: 600 seconds
- `selector` - matchLabels and matchExpressions for selecting pods to evict
```yaml
apiVersion: ballast.bonny.run/v1
kind: EvictionPolicy
metadata:
  name: unpreferred-nodes-nginx
spec:
  mode: unpreferred
  maxLifetime: 600
  selector:
    matchLabels:
      app: nginx
    matchExpressions:
      - {key: tier, operator: In, values: [frontend]}
      - {key: environment, operator: NotIn, values: [dev]}
```
Ballast is built with the bonny operator framework and Elixir.
Terraform is used to provision test clusters.
A number of make commands exist to aid in development and testing:

```shell
make help
```
Two test suites are provided; both require a functioning Kubernetes server. Docker Desktop ships with a version of Kubernetes to get started locally quickly.

Alternatively you can use Terraform to provision a cluster on GKE with `make dev.cluster.apply`. You will be charged for resources when using this approach.
First you will need to configure terraform with your GCP project and credentials:
```shell
touch ./terraform/terraform.tfvars
echo 'gcp_project = "my-project-id"' >> ./terraform/terraform.tfvars
echo 'gcp_credentials_path = "path/to/my/gcp-credentials.json"' >> ./terraform/terraform.tfvars
```
Now create the cluster; this can take a while:

```shell
make dev.cluster.apply
```

When you are done, destroy the cluster with:

```shell
make dev.cluster.delete
```
After setting up your test cluster you'll need to deploy the operator CRDs so that the cluster has the features the test suite will exercise.
```shell
make dev.start.in-cluster
```
Two test suites exist:

- `make test` - Elixir unit test suite on underlying controller code
- `make integration` - scales node pools on GKE
Two environment variables must be exported to run the full integration tests.

```shell
export GOOGLE_APPLICATION_CREDENTIALS=/abs/path/to/creds.json
export GCP_PROJECT=your-project-id
```
Additionally, `make lint` will run the mix code formatter, credo, and dialyzer.
You'll need a functioning cluster to connect to. Ballast will use your `current-context` in `~/.kube/config`. This can be changed in `config/dev.exs`.
`GOOGLE_APPLICATION_CREDENTIALS` must be set to start the application.

```shell
export GOOGLE_APPLICATION_CREDENTIALS=/abs/path/to/creds.json
```
Then run the following to generate a development manifest, apply it to your cluster, and start `iex`:

```shell
make dev.start.iex
```