Reduce bootstrap k8s cluster to single node (k3d agents=0) #506

@rowan-stein

User request

Reduce the Kubernetes node count in bootstrap to a single node to see whether this reduces provisioning time and overall resource consumption.

Specification (from research)

Bootstrap provisions a local k3d/k3s cluster via the Terraform stack stacks/k8s (provider agynio/k3d). The current defaults produce a 3-node cluster:

  • servers = 1
  • agents = 2

The target is a single-node Kubernetes cluster:

  • servers = 1
  • agents = 0

Required changes

  1. Default node counts

    • In stacks/k8s/variables.tf, change agents default from 2 to 0.
    • Add validations:
      • servers >= 1
      • agents >= 0
  2. Make node count explicit in bootstrap apply flow

    • In apply.sh, for the k8s stack, pass -var servers=... and -var agents=... using environment variables:
      • K3D_SERVERS (default 1)
      • K3D_AGENTS (default 0)
    • Add basic integer validation for these env vars (similar to existing PORT validation).
  3. Expose node count as action inputs (CI)

    • In .github/actions/provision/action.yml, add inputs:
      • servers default "1"
      • agents default "0"
    • Wire them to the apply.sh step via env vars (K3D_SERVERS, K3D_AGENTS).
  4. Verification

    • Add a post-provision assertion (in existing health verification step or a small new script step):
      • kubectl get nodes --no-headers | wc -l must equal 1.
    • Also log node taints for debugging if scheduling fails.
  5. Metrics for comparison (time/resources)

    • Ensure CI logs include (best-effort):
      • apply.sh timing summary (already present)
      • docker system df and df -h before/after provisioning
      • kubectl get nodes -o wide and kubectl get pods -A -o wide
    • These metrics should enable baseline comparison between agents=2 and agents=0 runs.
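
The env-var handling in step 2 could look roughly like the sketch below. It is an illustration only: validate_int and k8s_var_args are hypothetical helper names, and the existing PORT validation in apply.sh may be structured differently. The defaults (K3D_SERVERS=1, K3D_AGENTS=0) and the minimums (servers >= 1, agents >= 0) are taken from the spec above.

```shell
#!/usr/bin/env bash
# Sketch only: validate_int and k8s_var_args are hypothetical helpers,
# not functions known to exist in apply.sh.

# validate_int NAME VALUE MIN
# Returns 0 when VALUE is a base-10 integer >= MIN.
# Otherwise prints an error to stderr and returns 1.
validate_int() {
  local name="$1" value="$2" min="$3"
  case "$value" in
    '' | *[!0-9]*)
      echo "error: ${name} must be a non-negative integer, got '${value}'" >&2
      return 1
      ;;
  esac
  if [ "$value" -lt "$min" ]; then
    echo "error: ${name} must be >= ${min}, got '${value}'" >&2
    return 1
  fi
}

# Prints the Terraform -var arguments for the k8s stack, one per line,
# using K3D_SERVERS (default 1) and K3D_AGENTS (default 0).
k8s_var_args() {
  local servers="${K3D_SERVERS:-1}" agents="${K3D_AGENTS:-0}"
  validate_int K3D_SERVERS "$servers" 1 || return 1
  validate_int K3D_AGENTS "$agents" 0 || return 1
  printf '%s\n' "-var" "servers=${servers}" "-var" "agents=${agents}"
}
```

With the defaults, the emitted arguments correspond to terraform apply -var servers=1 -var agents=0. Validating in the shell as well as in variables.tf means a bad value fails before Terraform starts, with an error that names the offending env var.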

Acceptance criteria

  • Default bootstrap provisions exactly 1 Kubernetes node.
  • Existing platform verification (e.g., health checks) still passes.
  • CI remains green.
  • Logs (or artifacts, if already used) include enough timing/resource data to compare against the prior 3-node topology.
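
The first criterion can be checked with a small assertion like the sketch below. It is a sketch under assumptions: count_nodes and assert_node_count are hypothetical helper names, and the existing health verification step may need a different shape. The helpers read the node listing from stdin so they can be exercised without a cluster.

```shell
#!/usr/bin/env bash
# Sketch only: count_nodes and assert_node_count are hypothetical helpers.

# Counts non-empty lines on stdin, i.e. one line per node in the output of
# `kubectl get nodes --no-headers`. grep -c exits 1 on zero matches, so the
# `|| true` keeps the function from failing under `set -e`.
count_nodes() {
  grep -c . || true
}

# assert_node_count EXPECTED
# Reads a node listing from stdin and returns 1 if the count differs.
assert_node_count() {
  local expected="$1" actual
  actual="$(count_nodes)"
  if [ "$actual" -ne "$expected" ]; then
    echo "error: expected ${expected} node(s), found ${actual}" >&2
    return 1
  fi
  echo "ok: ${actual} node(s)"
}
```

In the provision flow this would be invoked as kubectl get nodes --no-headers | assert_node_count 1. Using grep -c rather than wc -l avoids the leading whitespace that some wc implementations print, which would otherwise break a string comparison.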

Notes / constraints

  • The k3d load balancer container will still exist; success means 1 Kubernetes node, not 1 Docker container.
  • This is intended for bootstrap/CI/local usage (not HA).
