devex: Setup local development loop for Kubernetes driver #1016

@TaylorMutch

Description

Problem Statement

Contributors to OpenShell targeting Kubernetes deployment should have a simple, easy-to-follow development cycle supported by the OpenShell project maintainers.

Proposed Design

Overview

The development loop is built around k3s as the local cluster, k3d as the macOS bootstrap mechanism to run k3s in a container, Skaffold for image build and sync automation, and new mise tasks that wire everything together. Two development modes are supported:

  • Out-of-cluster — the gateway binary runs on the host with KUBECONFIG pointing to the local k3s cluster. Fast iteration; no image rebuild needed for gateway changes.
  • In-cluster — gateway runs as a pod inside k3s, deployed via the existing Helm chart. Skaffold watches for source changes, rebuilds images, and triggers rolling restarts.

Both modes share the same local cluster and local image registry, so they can be freely combined.


Local Cluster: k3s via k3d

k3s is the target Kubernetes distribution — it is what production and the existing openshell-bootstrap cluster use. On macOS, k3s cannot run natively (it is a Linux process), so k3d is used to run k3s inside a container. k3d is a thin wrapper: it creates a k3s cluster as a set of containers on the local machine, exposes a kubeconfig, and provides a built-in local image registry. On Linux, k3s can be installed and run directly without k3d.

k3d supports both Docker and Podman as the underlying container engine. The dev tasks detect the active engine (using the same container-engine.sh detection the project already uses) and set DOCKER_HOST accordingly before invoking k3d:

| Engine | `DOCKER_HOST` value |
| --- | --- |
| Docker | unset (default socket) |
| Podman on macOS | `unix://$HOME/.local/share/containers/podman/machine/podman.sock` |
| Podman on Linux | `unix://$XDG_RUNTIME_DIR/podman/podman.sock` (or `/run/user/{uid}/podman/podman.sock`) |

This matches the socket path logic already established in crates/openshell-driver-podman/src/config.rs:116–128. Podman users on macOS must have podman machine start running — the same prerequisite required by the existing Podman compute driver.
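The derivation can be sketched as a small pure function, so any task can reuse it. This is an illustrative sketch only: the function names are assumptions, and in practice the engine name would come from the project's existing `container-engine.sh` detection.

```shell
#!/usr/bin/env sh
# Derive the DOCKER_HOST value k3d should use for a given engine/OS pair.
# Pure string logic; the engine argument would come from container-engine.sh.
podman_docker_host() {
  case "$(uname -s)" in
    Darwin)
      # Requires `podman machine start` on macOS.
      echo "unix://${HOME}/.local/share/containers/podman/machine/podman.sock"
      ;;
    Linux)
      # Prefer XDG_RUNTIME_DIR; fall back to the conventional per-uid path.
      echo "unix://${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/podman/podman.sock"
      ;;
  esac
}

docker_host_for_engine() {
  case "$1" in
    podman) podman_docker_host ;;
    docker) echo "" ;;  # leave DOCKER_HOST unset for Docker's default socket
  esac
}
```

A task would then `export DOCKER_HOST="$(docker_host_for_engine "$ENGINE")"` (skipping the export when the value is empty) before any `k3d` invocation.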

The dev cluster is created with a bundled local registry:

```shell
k3d cluster create openshell-dev \
  --registry-create openshell-registry:5000 \
  --port "8080:30051@loadbalancer" \
  --agents 1
```

This produces:

  • A kubeconfig merged into ~/.kube/config (or written to ~/.kube/k3d-openshell-dev.yaml)
  • A local OCI registry at localhost:5000 reachable from inside k3s pods as openshell-registry:5000
  • A host port mapping on 8080 matching the existing Helm service config

Mise Tooling

Tool declarations added to mise.toml:

```toml
[tools]
"aqua:k3d"      = "latest"   # k3s-in-container cluster management (macOS)
"aqua:skaffold" = "latest"   # image build + deploy automation
"aqua:kubectl"  = "latest"   # cluster interaction
"aqua:helm"     = "latest"   # already used by deploy/helm
```

New tasks in tasks/k8s-dev.toml:

| Task | Description |
| --- | --- |
| `mise run k8s:dev:up` | Create the k3d cluster and registry if not already running; set `DOCKER_HOST` for Podman users |
| `mise run k8s:dev:down` | Tear down the k3d cluster and registry |
| `mise run k8s:dev:bootstrap` | Apply CRDs and deploy the Helm chart with `values.dev.yaml` overrides |
| `mise run k8s:dev:gateway` | Run the gateway out-of-cluster with `KUBECONFIG` and `--drivers kubernetes` |
| `mise run k8s:dev:watch` | Run `skaffold dev` for the in-cluster hot-reload loop |
| `mise run k8s:dev:sync` | One-shot: build and push gateway and supervisor images to the local registry |

k8s:dev:up and k8s:dev:bootstrap are idempotent prerequisites for both development modes.

All tasks that invoke k3d or container operations source tasks/scripts/container-engine.sh to reuse the existing engine detection, then derive and export DOCKER_HOST before any k3d calls.
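As a sketch of what one such task might look like (the `[tasks.*]` table syntax from `mise.toml` is assumed here; the final `tasks/k8s-dev.toml` layout may differ):

```toml
# Hypothetical sketch of the "up" task; not the final implementation.
[tasks."k8s:dev:up"]
description = "Create the k3d dev cluster and registry if not already running"
run = """
set -eu
. tasks/scripts/container-engine.sh          # existing engine detection
# DOCKER_HOST derivation for Podman would happen here, before any k3d call.
k3d cluster list openshell-dev >/dev/null 2>&1 || \
  k3d cluster create openshell-dev \
    --registry-create openshell-registry:5000 \
    --port "8080:30051@loadbalancer" \
    --agents 1
"""
```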


Out-of-Cluster Development

The Kubernetes driver already falls back to kube::Config::infer() when not running inside a pod (crates/openshell-driver-kubernetes/src/driver.rs:119–145), which reads from KUBECONFIG. No driver changes are required for this mode.

mise run k8s:dev:gateway sets the necessary environment and starts the gateway:

```shell
export KUBECONFIG=~/.kube/k3d-openshell-dev.yaml
export OPENSHELL_DRIVERS=kubernetes
export OPENSHELL_SANDBOX_NAMESPACE=openshell
export OPENSHELL_SANDBOX_IMAGE=openshell-registry:5000/openshell-supervisor:dev
cargo run -p openshell-server -- ...
```

Sandbox pods are created inside k3s; the gateway watches them via the Kubernetes API. The supervisor binary sideload path from driver.rs:657–761 continues to work — kubectl cp replaces docker cp for pushing an updated binary into the k3s node.
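The sideload step reduces to constructing a `kubectl cp` invocation. A minimal sketch, where the namespace, pod name, and paths are all illustrative assumptions rather than the project's actual values:

```shell
#!/usr/bin/env sh
# Build the kubectl cp command for pushing a freshly built supervisor
# binary into a running pod. All names and paths here are illustrative.
build_sideload_cmd() {
  ns="$1"; pod="$2"; local_bin="$3"; dest="$4"
  echo "kubectl cp ${local_bin} ${ns}/${pod}:${dest}"
}
```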


In-Cluster Development with Skaffold

skaffold.yaml at the repo root defines two build artifacts:

```yaml
apiVersion: skaffold/v4beta11
kind: Config
build:
  local:
    push: true
  artifacts:
    - image: openshell-registry:5000/openshell-gateway
      docker:
        dockerfile: deploy/docker/Dockerfile.gateway
    - image: openshell-registry:5000/openshell-supervisor
      docker:
        dockerfile: deploy/docker/Dockerfile.supervisor
deploy:
  helm:
    releases:
      - name: openshell
        chartPath: deploy/helm/openshell
        valuesFiles:
          - deploy/helm/openshell/values.dev.yaml
```

values.dev.yaml overrides the image registry to openshell-registry:5000 and sets imagePullPolicy: Always so k3s always pulls from the local registry on pod restart.
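A minimal `values.dev.yaml` along these lines would suffice (the key names below are assumptions; they must match however the chart's existing `values.yaml` parameterizes registry and pull policy):

```yaml
# Hypothetical values.dev.yaml: point the chart at the in-cluster registry
# and force pulls so pod restarts pick up freshly pushed images.
image:
  registry: openshell-registry:5000
  pullPolicy: Always
```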

mise run k8s:dev:watch runs skaffold dev, which:

  1. Builds changed images and pushes them to localhost:5000 (the k3d registry)
  2. Runs helm upgrade to roll out updated images
  3. Streams pod logs to the terminal
  4. Tears down on Ctrl+C

For fast iteration on the supervisor binary alone, Skaffold file sync can be configured to copy the compiled binary directly into a running pod without a full image rebuild.
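Sketched against Skaffold's manual file-sync schema, that could look like the following artifact stanza. The source path is an assumption (the binary must be a Linux build to run inside the pod), and the destination path is illustrative:

```yaml
# Sketch only: manual file sync for the supervisor artifact.
- image: openshell-registry:5000/openshell-supervisor
  sync:
    manual:
      - src: target/x86_64-unknown-linux-musl/debug/openshell-supervisor
        dest: /usr/local/bin
```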


Development Mode Summary

```text
# One-time setup (idempotent)
mise run k8s:dev:up         # start k3s cluster via k3d (Docker or Podman)
mise run k8s:dev:bootstrap  # install Helm chart + CRDs into cluster

# Out-of-cluster (fastest gateway iteration)
mise run k8s:dev:gateway
  └─ KUBECONFIG=~/.kube/k3d-openshell-dev.yaml cargo run -p openshell-server

# In-cluster (full stack validation)
mise run k8s:dev:watch
  └─ skaffold dev --kubeconfig ~/.kube/k3d-openshell-dev.yaml
```

Alternatives Considered

k3d vs kind vs minikube for macOS:
k3d runs k3s specifically, keeping the local cluster consistent with the production runtime and openshell-bootstrap. kind runs upstream Kubernetes, which is a wider divergence from production. minikube adds a VM layer and is heavier to manage. k3d also has the best support for embedded local registries with minimal configuration.

Skaffold vs Tilt:
Both are strong choices. Tilt offers a web-based dev UI and more programmable Tiltfile configuration (Starlark). Skaffold is chosen here because it is purely terminal-driven (consistent with the project's CLI and mise task focus), has first-class Helm support, and produces CI-reproducible builds via skaffold run. Tilt remains a viable alternative if contributors prefer its interactive UI. The skaffold.yaml is straightforward enough that a Tiltfile equivalent is low effort.

Remote shared cluster vs local cluster:
A shared remote dev cluster is an option for teams with constrained local resources. It is out of scope for this issue but the same Skaffold and mise tasks work against a remote cluster by pointing KUBECONFIG at the remote context.

Agent Investigation

  • The Kubernetes driver (crates/openshell-driver-kubernetes/src/driver.rs:119–145) already supports out-of-cluster use via kube::Config::infer() — no driver changes are needed.
  • The supervisor binary sideload via hostPath volume (driver.rs:657–761) works unchanged with k3d nodes; kubectl cp is the k3d equivalent of docker cp.
  • The project already has a mature container engine abstraction in tasks/scripts/container-engine.sh that detects Docker vs Podman and normalizes the interface. k3d tasks should source this and derive DOCKER_HOST before invoking k3d.
  • Podman machine socket paths are already resolved in crates/openshell-driver-podman/src/config.rs:116–128 — the same logic applies to setting DOCKER_HOST for k3d.
  • The Helm chart at deploy/helm/openshell/values.yaml is parameterized for image registry and pull policy overrides; a values.dev.yaml is the minimal integration surface.
  • No Skaffold, Tilt, or docker-compose config exists today — this is a greenfield addition.
  • Existing mise cluster tasks in tasks/cluster.toml and image build tasks in tasks/docker.toml provide good structural patterns for the new tasks/k8s-dev.toml.

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request
