There are two main requirements for deploying with Kuby: an environment that can run Docker builds, and a Ruby installation in order to run kuby itself.
Kuby itself runs Docker, and the Docker build installs Kuby again inside the image. So there are two places in the CI process where gems will be installed: inside and outside of the Docker image.
If both cannot be cached effectively, then many minutes will be wasted on every build re-installing Ruby gems and re-compiling native extensions. Correct configuration of caching avoids this waste, which slows our builds and adds no value.
tl;dr
copy .github/workflows/kuby.yml
This repo is a basic Rails app that should serve as a proving ground for implementing the advice that follows. If you already have a kuby.rb file and have installed the getkuby/kuby gem in your Rails app, then you can copy this file into the same path in your git repository and adjust the parameters, like the Ruby version, for your own environment.
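For orientation, a kuby.yml along these lines might look roughly like the sketch below, assembled from the snippets discussed in the rest of this post (the trigger, runner image, and checkout step are assumptions; adjust the Ruby version and other details for your environment):

name: Kuby
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest          # assumption: any recent hosted runner works
    permissions:
      packages: write
      contents: read
    steps:
      - uses: actions/checkout@v2   # assumption: any current checkout version
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: 2.7.5
          bundler-cache: true
          cache-version: 1
      - name: Expose GitHub Runtime
        uses: crazy-max/ghaction-github-runtime@v1
      - run: bundle exec kuby -e production build
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}
      - run: bundle exec kuby -e production push
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}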
Unless you have everything configured already, there are some steps to make this work on a new repository.
How do I run this workflow on a fresh repo?
The ruby/setup-ruby@v1 action installs Ruby and provides caching of the bundle artifacts (outside of the Docker image).
- uses: ruby/setup-ruby@v1
with:
ruby-version: 2.7.5
bundler-cache: true
cache-version: 1
You can increment cache-version to destroy the cache if it ever gets corrupted. This step runs bundle install.
No additional configuration is needed; the GitHub Actions caching service is used underneath.
Kuby uses Docker to build and push container images to an image registry.
This section of kuby.rb tells Kuby where our images are stored, and what credentials can be used to push or pull them:
Kuby.define 'KubyTest' do
  environment :production do
    # ...
    docker do
      credentials do
        username app_creds[:KUBY_DOCKER_USERNAME]
        password ENV['GITHUB_TOKEN']
        email app_creds[:KUBY_DOCKER_EMAIL]
      end

      image_url 'ghcr.io/kingdonb/kuby-tester'
    end
    # ...
  end
end
Each commit on the main branch results in a build and push. The destination image repository is defined in kuby.rb in the docker block, through the image_url method.
In credentials.yml.enc we have stored some encrypted values in app_creds, following the Kuby guide. We placed a value in KUBY_DOCKER_USERNAME that is used as an imagePullSecret, and a corresponding token in KUBY_DOCKER_PASSWORD which should not be given write access to the package registry. This is for secure image pull access only (and could be omitted altogether for public images).
By using an environment variable instead of storing a PAT in the Rails encrypted credentials file, we let Kuby use ambient credentials, substituting a token with write:packages only when it is needed. If a private repo is used, be aware that a token with read:packages will also be needed at build time: Kuby's assets image includes a copy of the assets' prior version, so the builder needs to be able to pull from the registry.
We can add a repository secret CR_PAT (or any name other than GITHUB_TOKEN) and populate it with a Personal Access Token that has the write:packages scope, as described in the GitHub Docs. This may be helpful, and necessary, for an admin creating a new package for the first time, when no existing package repository already falls within the git repository's scope.
If you still need to create a package registry on ghcr.io, you can go ahead and start by generating a Personal Access Token now.
Creation of a new GitHub package happens implicitly when the first image is pushed, so use your PAT to push any image to the registry you wish to create. If the package name doesn't match the source repository (or perhaps even if it does match), it may also be necessary to connect the repo to the package manually, since GitHub won't be able to connect them implicitly.
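For example, assuming you have exported the PAT as CR_PAT locally, a one-time bootstrap push to the image repository named in kuby.rb could look like this (the tag and source image here are only illustrative):

echo $CR_PAT | docker login ghcr.io -u kingdonb --password-stdin
docker pull alpine:latest
docker tag alpine:latest ghcr.io/kingdonb/kuby-tester:bootstrap
docker push ghcr.io/kingdonb/kuby-tester:bootstrap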
Configure the package settings now at the registry configuration page, in case you would like to make this registry public, and add write access for the Actions runner. Then the permissions we configure below can push images through ambient credentials, without storing any PAT!
jobs:
  build:
    permissions:
      packages: write
      contents: read
After properly associating our Git repo with the package, we can update our workflows as above. It may also be necessary to grant write access to workflows; review the GitHub documentation linked above for more information.
With that configuration, the ambient GITHUB_TOKEN can be used for pushes. This mitigates a risk of compromise: since no one will need to handle a Personal Access Token with write:packages ever again, it can be deleted now and will not be at risk any further. It may have been unnecessary to generate a PAT just to create a new ghcr.io package/image registry, but either way we can delete it now, or let it expire, once our package registry has been created!
- run: bundle exec kuby -e production build
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}
- run: bundle exec kuby -e production push
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}
GITHUB_TOKEN is an ambient secret, populated automatically by GitHub Actions with a repository-scoped token.
Populate another secret, RAILS_MASTER_KEY, so that Kuby can build assets. Save a copy of config/master.key securely, and destroy the original! I have simply moved mine outside of the repository path, into my project workspace on the laptop.
export RAILS_MASTER_KEY=$(cat ../kuby_test-master.key)
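If you use the GitHub CLI, one way to populate the repository secret from that same saved key file (the path here is the one from the example above) could be:

gh secret set RAILS_MASTER_KEY < ../kuby_test-master.key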
Kuby always expects to find a RAILS_MASTER_KEY in the environment at kuby build time.
Depending on your app's initializers, other access could also be needed.
We should also note that kuby build does not run any tests. If your app has tests, then make another workflow and run them separately.
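A separate test workflow might be as small as the following sketch (the test command is an assumption; substitute rspec or whatever your app actually uses):

name: Test
on: push
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: 2.7.5
          bundler-cache: true
      - run: bundle exec rails test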
During Kuby setup, you will have added several variables for Kuby to the Rails encrypted credentials store:
secret_key_base: 5b12d6fe5afdac2f910cf3d316ea1bc9f6d779f4950f8333e2c7a6a6b85b67dbb9665deb3f380f881f0ebdc2de9acb371efb2c5caa863d8c359eded864d4e547
# Rails has randomly generated a secret_key_base, above. Run `rails credentials:edit` and provide your own values for the variables below:
KUBY_DOCKER_USERNAME: kingdonb
KUBY_DOCKER_PASSWORD: ghp_XXXinvalid12345abcdefghijklmnopqrstu
KUBY_DOCKER_EMAIL: example@example.com
KUBY_DB_USER: kuby_test
KUBY_DB_PASSWORD: Oosh0sadooz5osh@eir2Ioj
KUBY_DIGITALOCEAN_ACCESS_TOKEN: XXXinvalidHuferiuKe4Yexoo9nohngaen3aiZieQuecoh6quai2ielae8ongoob
KUBY_DIGITALOCEAN_CLUSTER_ID: 8704193d-a88c-41b9-b9c0-cd290774d34e
Depending on your choice of Kubernetes hosting provider, you may or may not have to include any access tokens or database passwords. The remaining details of this configuration are out of scope for this document.
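To add or change any of these values, edit the encrypted credentials file with the master key present; for example (vim here is just an example editor):

EDITOR=vim bin/rails credentials:edit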
The Docker password will be used as an imagePullSecret in the default configuration from Kuby; our configuration doesn't use this value at all. Setting up imagePullSecrets on your manifests may be necessary with many registries simply to prevent rate limiting. Kuby generates a dockerconfigjson secret based on your configuration in the credentials block inside docker, referenced above.
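Conceptually, the generated secret is a standard Kubernetes registry secret; a rough sketch of its shape follows (the name and contents here are illustrative, Kuby generates the real ones):

apiVersion: v1
kind: Secret
type: kubernetes.io/dockerconfigjson
metadata:
  name: kuby-tester-dockerconfig   # illustrative; Kuby chooses the actual name
data:
  # base64-encoded {"auths":{"ghcr.io":{"username":"...","password":"...","email":"..."}}}
  .dockerconfigjson: <base64-encoded registry auth JSON>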
The kuby build (or kuby -e production build) step can complete in a few seconds if the caches are perfect and no assets have changed. If assets must be precompiled or recompiled, then that should be the only layer that needs rebuilding; Kuby creates a complex Dockerfile in memory to decide this.
With an ideal caching configuration, builds that do not update Gemfile or Gemfile.lock should finish in well under 2 minutes. (Precise benchmarks were not available at press time because my workstation is an M1 MacBook Pro, so everything runs under x86_64 emulation and is much slower than on GitHub. The caching configuration at press time was also not ideal.)
This option should be very easy to achieve if you're already using a self-hosted runner with the same architecture as your cloud. Simply arrange for the builds to run on the same host every time, and there will be no need for caches to be transported from one build node to the next.
The problem with this idea: that node may live forever on the public cloud somewhere, and that costs a lot. GitHub Actions is free for public repos, but self-hosted runners take up physical real estate and cost electricity to run, none of which is free. So this is not actually a free solution.
Ideally, our caching solution would not require builds to use a self-hosted runner, or to always land each build for our project on the same node every time. Configuring a self-hosted runner is therefore discounted as a solution, and further exploration is beyond the scope of this document.
If Kuby can run builds through docker buildx build, then certain other options become available.
Look to Flux's image reflector controller for inspiration. That approach also uses actions/cache@v1.
This option has not been fully explored yet, but any project without permanently hosted resources of its own can follow Flux's example.
One notable difference that users will likely encounter if they try to maintain a local cache between builds is that the cache scope must be unique for each image built. I was not able to adapt this strategy before giving up and looking for other options, but I don't think there is any reason it should not work just fine; a sketch follows.
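A sketch of that approach for the app image, persisting a local BuildKit cache directory with actions/cache between runs, might look like this (the cache path and key are assumptions, and the assets image would need its own separate path and key):

- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v1
- name: Cache app image layers
  uses: actions/cache@v1
  with:
    path: /tmp/.buildx-cache-app
    key: ${{ runner.os }}-buildx-app-${{ github.sha }}
    restore-keys: |
      ${{ runner.os }}-buildx-app-
- name: Kuby build app image against the local cache
  run: |
    bundle exec kuby -e production build --only app -- \
      --cache-from=type=local,src=/tmp/.buildx-cache-app \
      --cache-to=type=local,dest=/tmp/.buildx-cache-app,mode=max
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}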
Running builds is a job for the project infrastructure, and builds can use the local cache when nodes are reused. GitHub will sometimes reuse our nodes, and we can opportunistically take advantage when that does occur. But can we keep an instance of the BuildKit builder around for longer than the lifetime of a build job?
Expiration or invalidation of the cache will cause some builds to take longer. This is not a disaster, but it should be treated as necessary and expected; builds will always take some time, and no matter how well our cache strategies work, we will rarely if ever see them running at a 100% hit ratio.
This seems to be the best option.
- name: Expose GitHub Runtime
  uses: crazy-max/ghaction-github-runtime@v1
- name: Kuby build (and push) app image
  run: |
    bundle exec kuby -e production build --only app -- \
      --cache-from=type=gha,scope=app \
      --cache-to=type=gha,scope=app,mode=max \
      --push
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}
- name: Kuby build (and push) assets image
  run: |
    bundle exec kuby -e production build --only assets -- \
      --cache-from=type=gha,scope=assets \
      --cache-to=type=gha,scope=assets,mode=max \
      --push
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}
This uses the GitHub Actions Caching API directly, which does not assume that builds can carry anything with them into their next life.
Only a runner that kept a builder around with local, persistent caches could be faster; on a cache hit it would not even need to reach across GitHub's high-speed local network link to the cache service to download image layers. With caching, builds can be made very fast, and such a persistent runner could be even faster. However, as the next option will show, rebuilding a new container image from scratch might not always be necessary. There are some great options here for Ruby devs.
If waiting 30 seconds between each push for CI to build and Kubernetes to deploy is still too long for your dev team's expectations or sensibilities, go ahead and build on something like Okteto where your changes can be synced directly into a running pod.
There you can run in development mode, taking advantage of every code hot-reloading feature that your language or chosen frameworks can provide! The Okteto CLI is completely open source and free for developers to use, and it can run dev pods on any Kubernetes cluster.
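As a sketch, an Okteto dev manifest (okteto.yml) for a Rails app might look roughly like this; the image, sync path, and port are assumptions for illustration:

name: kuby-test
image: ruby:2.7.5
command: bash
sync:
  - .:/usr/src/app
forward:
  - 3000:3000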
When implemented well, this approach can even help some members of your team participate in Ruby development while avoiding the need to install or use Ruby locally. No one should let friends run bundle install on a laptop (that's a job for robots in the cloud!).
We can spend some time ensuring we don't commit waste in our CI builds. But before spending too much time optimizing caches for every niche, we can look for solutions that help us build less often. If we can accept a couple of minutes of waste "only this once," so that running a full build is not always a requirement for testing a change, then we can run our sometimes-wasteful builds even less often! Now, let's try it out.
What steps are necessary if we are tl;dr and want to YOLO through the instructions without reading any of the backstory or character development in the previous paragraphs?
TODO