Problem: we offer a set of features that allow users to run EKS-A clusters in disconnected environments, but they lack cohesion.
Over time, we have developed a set of features and tools that, although aimed at solving independent problems, all converge on supporting EKS-A in environments without an Internet connection. Because they were developed independently, their interfaces and behaviors are not in sync, which makes the user experience less than ideal.
This document aims to solve that by learning from the experience gained building those tools and offering a more cohesive and simpler solution, without drastically changing the backbone of those features or how they interact with each other.
- Simple: minimal interface, minimum number of manual steps.
As an EKS-A cluster administrator I want to:
- Manage clusters in an environment without Internet connection.
- Populate my environment with the necessary dependencies without having to worry about what those dependencies are and with minimal effort.
- Interact with EKS-A clusters in air-gapped environments in the same way I interact with any other EKS-A cluster.
In scope
- Download dependencies and package them.
- Populate disconnected environments with dependencies without Internet connection.
Not in scope
- Set up additional infrastructure to support disconnected environments (e.g. private registries).
Future scope
- Package dependencies selectively.
- Populate disconnected environments with dependencies with Internet connection.
In order to support disconnected environments, users currently have to:
- Provision an OCI registry.
- Run `eksctl anywhere download artifacts` to download manifests to disk.
- Run `eksctl anywhere download images` to download images and charts to disk.
- Run `eksctl anywhere import images` to push container images and charts to a registry.
- Configure their cluster spec file to point to the registry.
- Run all cluster commands (`create`, `upgrade`, etc.) with the `--bundles-override` flag pointing to the `Bundles` manifest downloaded by `download artifacts`.
- Users will provide an OCI registry.
- We will store all dependencies (container images, Helm charts and YAML manifests) in that registry.
- If a cluster config with a `registryMirrorConfiguration` is provided, the CLI and the rest of the components will always pull dependencies from that registry. No extra flags or configuration will be needed, and the user will interact with this cluster as with any other.
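The "always pull from the mirror" rule can be illustrated with a minimal sketch. The `RegistryMirror` type and `resolve_image` helper are hypothetical names introduced only for this example and are not part of the EKS-A codebase; the image reference in the usage note is likewise illustrative. The sketch assumes the mirror keeps the same repository path as the upstream registry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryMirror:
    """Hypothetical stand-in for the cluster spec's registryMirrorConfiguration."""
    endpoint: str  # e.g. "myregistry.com"


def resolve_image(image: str, mirror: Optional[RegistryMirror]) -> str:
    """Return the reference to pull: the original, or the same repository on the mirror."""
    if mirror is None:
        # Connected cluster: nothing to rewrite.
        return image
    first, sep, rest = image.partition("/")
    # Treat the first path component as a registry host only if it looks like one.
    has_registry = bool(sep) and ("." in first or ":" in first or first == "localhost")
    repository = rest if has_registry else image
    # Keep repository, tag and digest; swap only the registry host.
    return f"{mirror.endpoint}/{repository}"
```

With a mirror at `myregistry.com`, a reference such as `public.ecr.aws/eks-anywhere/cli-tools:v1` would resolve to `myregistry.com/eks-anywhere/cli-tools:v1`, and without a mirror it is returned unchanged, so the caller does not need a separate code path for disconnected clusters.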
In order to populate the registry, we will offer two commands:
eksctl anywhere export artifacts --output artifacts.tar
eksctl anywhere import artifacts --input artifacts.tar --registry myregistry.com
The first one downloads all dependencies (YAML manifests, images and Helm charts). Its only argument is the destination file, where the command creates a tarball containing all three types of artifacts.
The second one unpacks the tarball created by the first command, reads the packaged `Bundles` manifest and imports the referenced dependencies into a registry. Its inputs are the artifacts tarball, the registry endpoint and the registry credentials (provided through environment variables).
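The export/import round trip described above could look roughly like the following sketch. It is not the CLI's implementation: the manifest is a simplified JSON file (the real `Bundles` manifest is YAML), and `MANIFEST_NAME`, `export_artifacts`, `import_artifacts` and the `push` callback are all hypothetical names used for illustration.

```python
import io
import json
import tarfile

MANIFEST_NAME = "bundles.json"  # hypothetical; the real Bundles manifest is YAML


def export_artifacts(output: str, manifest: dict, blobs: dict) -> None:
    """Write the manifest plus every artifact blob into a single tarball."""
    entries = {MANIFEST_NAME: json.dumps(manifest).encode(), **blobs}
    with tarfile.open(output, "w") as tar:
        for name, data in entries.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


def import_artifacts(input_path: str, registry: str, push) -> list:
    """Read the packaged manifest and push each referenced artifact to the registry."""
    pushed = []
    with tarfile.open(input_path, "r") as tar:
        manifest = json.load(tar.extractfile(MANIFEST_NAME))
        for ref in manifest["artifacts"]:
            data = tar.extractfile(ref["file"]).read()
            destination = f"{registry}/{ref['name']}"
            push(destination, data)  # assumed callback that uploads to the registry
            pushed.append(destination)
    return pushed
```

The point of the sketch is that the import side needs nothing beyond the tarball and the registry endpoint: the list of what to push comes from the manifest packaged inside the tarball itself.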
These two commands can be expanded in the future to add more capabilities, like selectively exporting dependencies (e.g. for only one Kubernetes version or only one provider).
A specific version of the CLI tied to one `Bundles` manifest will always produce the exact same artifacts tarball.
This means that we can prepackage the dependencies, store them in a public bucket and reference the tarball in the `Release` manifest.
This simplifies the experience for users who are interested in the default dependencies bundle.
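For a prepackaged tarball to be referenced reliably, the packaging step should be byte-for-byte reproducible. The sketch below shows one way that property could be achieved and checked; it is an assumption about how reproducibility might be implemented, not a description of the CLI. `build_tarball` and `digest` are hypothetical helpers.

```python
import hashlib
import io
import tarfile


def build_tarball(files: dict) -> bytes:
    """Build a tarball whose bytes depend only on file names and contents."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Sort names so insertion order does not affect the output.
        for name in sorted(files):
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            # Pin metadata that would otherwise vary between machines or runs.
            info.mtime = 0
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def digest(files: dict) -> str:
    """SHA-256 of the reproducible tarball, usable as a checksum next to a download URL."""
    return hashlib.sha256(build_tarball(files)).hexdigest()
```

Publishing such a digest alongside the tarball reference would let users verify that a downloaded package matches what a given CLI version would have exported itself.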
The proposed design can be implemented in 4 incremental phases:
- Add the two new commands but only push images and charts to the registry. This will still require using the `--bundles-override` flag pointing to a `Bundles` manifest on disk. This is an incremental improvement over the current state of `download images`, so work should be minimal.
- Push manifests to the OCI registry and add the capability to the CLI to download them.
- Start storing our default manifests in public ECR as opposed to CloudFront + S3. This further unifies the behavior of connected and disconnected environments.
- Start packaging dependencies and serving them from the `Release` manifest.
This proposal deprecates several commands:
- `download artifacts`
- `download images`
- `import-images`
- `import images`
We should inform users in the next release notes and keep the old commands in the codebase for at least one more release, printing a warning when executed.
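A minimal sketch of the warning behavior, assuming each deprecated command keeps working and only prints a notice first. The mapping from old to new commands is an assumption made for this example (the proposal does not spell it out), and `DEPRECATED`, `deprecation_warning` and `run` are hypothetical names.

```python
import sys

# Assumed mapping from deprecated commands to their replacements.
DEPRECATED = {
    "download artifacts": "export artifacts",
    "download images": "export artifacts",
    "import-images": "import artifacts",
    "import images": "import artifacts",
}


def deprecation_warning(command: str):
    """Return a warning message for deprecated commands, or None otherwise."""
    replacement = DEPRECATED.get(command)
    if replacement is None:
        return None
    return (
        f"Warning: '{command}' is deprecated and will be removed in a future "
        f"release; use '{replacement}' instead."
    )


def run(command: str, handler, stderr=sys.stderr):
    """Print the warning (if any) to stderr, then run the command as before."""
    warning = deprecation_warning(command)
    if warning:
        print(warning, file=stderr)
    return handler()
```

Writing the notice to stderr keeps stdout unchanged, so scripts that parse the output of the old commands would keep working during the deprecation window.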
We should add at least one E2E test for the whole flow:
- Download artifacts to disk
- Import artifacts to a private registry
- Create a cluster with `registryMirrorConfiguration` (ideally in an environment without external Internet connection).