
Direct Containerd k1s Profile Validation

m4xx3d0ut edited this page May 13, 2026 · 1 revision

Use this runbook when a k1s agent is working from the sibling ../k1s checkout and needs to validate k1s through WorkerBee's direct-containerd profile harness. This path exercises WorkerBee profile staging, direct containerd runtime isolation, profile ingress, workload probes, WebSocket behavior, and artifact export. It complements native k1s make profile lanes; it does not replace strict CRI, core/edge, or host-topology validation in the k1s repo.

Scope

Run from the common sibling layout:

k1s-wt/
  k1s/
  k1s-workerbee/
  k1s-workerbee.wiki/

The commands below are written for an agent shell whose current directory is ../k1s. They call the WorkerBee CLI directly with explicit direct-containerd flags. The repo helper ../k1s-workerbee/scripts/dev/wb-containerd is only a convenience wrapper around the same defaults; prefer the CLI form here so the runtime, state root, project, and cwd are visible in copied logs.

Use short explicit project names. Profile DNS names, container labels, and containerd namespaces include the project name, so keep each generated project name to about 12 characters or fewer; the Preflight defaults produce 11-character and 12-character names. The startup sweep can share one project, but the workload gates should use isolated projects so stale app state from one profile does not affect the next profile.
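The length guidance above can be checked before any project is created. The following is a minimal sketch, assuming a hypothetical helper named check_project_name and treating 12 characters as a soft limit taken from this runbook, not as a documented WorkerBee constraint:

```shell
# Hypothetical helper: flag project names longer than 12 characters.
# The limit of 12 follows this runbook's guidance; it is an assumption,
# not a hard limit confirmed by WorkerBee.
check_project_name() {
  name="$1"
  if [ "${#name}" -le 12 ]; then
    echo "ok: $name (${#name} chars)"
  else
    echo "too long: $name (${#name} chars)"
    return 1
  fi
}

check_project_name "k1spv123456s"
check_project_name "k1s-profile-validation" || true
```

The first call mirrors the shape of a default workload project name (prefix, six-digit run ID, one-letter suffix); the second shows the kind of descriptive name the guidance is meant to rule out.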

Preflight

From the k1s checkout:

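# Adjust this path to your own k1s checkout.
cd /home/m4xx3d0ut/git/k1s-wt/k1s
export K1S_ROOT="$PWD"
export WORKERBEE_ROOT="$(cd ../k1s-workerbee && pwd)"
# Set WORKERBEE_BIN to override the default binary in the WorkerBee virtualenv.
export WB="${WORKERBEE_BIN:-$WORKERBEE_ROOT/.venv/bin/workerbee}"
export WB_STATE_ROOT="${WORKERBEE_CONTAINERD_STATE_ROOT:-/tmp/workerbee-containerd-verify}"
# The six-digit UTC time keeps the default project names at 11 or 12 characters.
export WB_RUN_ID="${WB_RUN_ID:-$(date -u +%H%M%S)}"
export WB_PROJECT_BASE="${WB_PROJECT_BASE:-${WB_PROJECT:-k1spv$WB_RUN_ID}}"
# Startup sweep project, then one isolated project per workload gate.
export WB_PROJECT="${WB_PROJECT:-$WB_PROJECT_BASE}"
export WB_SINGLE_PROJECT="${WB_SINGLE_PROJECT:-${WB_PROJECT_BASE}s}"
export WB_HA_PROJECT="${WB_HA_PROJECT:-${WB_PROJECT_BASE}h}"
# One timestamped directory collects every JSON artifact from this run.
export RUN_DIR="/tmp/workerbee-k1s-profile-validation-$(date -u +%Y%m%dT%H%M%SZ)"
mkdir -p "$RUN_DIR"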

wbcd_project() {
  local project="$1"
  shift

  "$WB" \
    --runtime containerd \
    --containerd-privilege sudo-helper \
    --state-root "$WB_STATE_ROOT" \
    --project "$project" \
    --cwd "$K1S_ROOT" \
    --json \
    "$@"
}

wbcd() {
  wbcd_project "$WB_PROJECT" "$@"
}
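To see exactly which flags the wrappers pass before running anything against containerd, the binary can be swapped for a stub. This is a sketch only: print_argv is a hypothetical stand-in for the WorkerBee CLI, and the project name and k1s path are placeholders.

```shell
# Hypothetical stub: prints its arguments instead of running WorkerBee.
print_argv() { printf '%s\n' "$*"; }

WB=print_argv
WB_STATE_ROOT=/tmp/workerbee-containerd-verify
K1S_ROOT=/tmp/k1s-example      # placeholder path, not a real checkout
WB_PROJECT=k1spvdemo           # placeholder project name

# Same wrappers as in the Preflight block.
wbcd_project() {
  local project="$1"
  shift
  "$WB" \
    --runtime containerd \
    --containerd-privilege sudo-helper \
    --state-root "$WB_STATE_ROOT" \
    --project "$project" \
    --cwd "$K1S_ROOT" \
    --json \
    "$@"
}

wbcd() {
  wbcd_project "$WB_PROJECT" "$@"
}

dry_run="$(wbcd profile list)"
echo "$dry_run"
```

The printed line is the argument list every wbcd call in this runbook expands to, with the subcommand appended after the shared flags.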

Confirm the WorkerBee CLI exists. If it does not, rebuild the WorkerBee editable environment from ../k1s-workerbee before continuing. Refresh sudo if the agent shell can do so noninteractively; an existing responsive sudo-helper is also acceptable and is checked below.

test -x "$WB"
sudo -n -v || true

Start or verify the direct-containerd MCP daemon. The workload validation path requires the daemon because profile dashboard/API/app ingress is owned by the global WorkerBee Caddy edge.

wbcd mcp status --host 127.0.0.1 --port 8765 | tee "$RUN_DIR/mcp-status-before.json"
if ! jq -e '.running and .runtime == "containerd" and (.global_dashboard.health_probe.ok // false)' \
  "$RUN_DIR/mcp-status-before.json"
then
  wbcd mcp restart --host 127.0.0.1 --port 8765 --timeout 90 | tee "$RUN_DIR/mcp-restart.json"
fi

wbcd mcp status --host 127.0.0.1 --port 8765 | tee "$RUN_DIR/mcp-status-after.json"
jq -e '.running and .runtime == "containerd" and (.global_dashboard.health_probe.ok // false)' \
  "$RUN_DIR/mcp-status-after.json"

wbcd containerd-privilege status | tee "$RUN_DIR/containerd-privilege.json"
wbcd projects | tee "$RUN_DIR/projects-before.json"
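The readiness filter used above can be tried against sample documents to see how it treats a missing health probe. The JSON below is illustrative and much smaller than real mcp status output; only the three fields the filter reads are assumed, and mcp_ready is a hypothetical helper name.

```shell
# Illustrative samples; real `mcp status` output carries more fields.
ready='{"running":true,"runtime":"containerd","global_dashboard":{"health_probe":{"ok":true}}}'
no_probe='{"running":true,"runtime":"containerd"}'
filter='.running and .runtime == "containerd" and (.global_dashboard.health_probe.ok // false)'

# Hypothetical helper: exit 0 only when the filter evaluates to true.
mcp_ready() { printf '%s' "$1" | jq -e "$filter" >/dev/null; }

mcp_ready "$ready" && echo "ready: pass"
mcp_ready "$no_probe" || echo "no_probe: fail"
```

Because jq -e exits non-zero when the last output is false or null, a status document without a health probe is treated as not ready, which is what triggers the restart branch.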

Preflight acceptance:

  • MCP is running with runtime set to containerd.
  • Global ingress is running and HTTPS health is OK.
  • The Caddy CA is ready.
  • Containerd privilege reports effective_mode as sudo-helper, or the host has known-good unprivileged containerd access.
  • If sudo -n -v failed, the existing helper is still acceptable when containerd-privilege status reports effective_mode=sudo-helper and the helper probe is responsive.
  • No unrelated agent is using $WB_PROJECT, $WB_SINGLE_PROJECT, or $WB_HA_PROJECT.

Profile Inventory Gate

Record the built-in profile inventory:

wbcd profile list | tee "$RUN_DIR/profile-list.json"
jq -e '.ok and .runtime_requirement == "containerd" and (.host_k1s_processes | not)' \
  "$RUN_DIR/profile-list.json"

The expected built-in profiles are:

k1s-dev-min-sqlite
k1s-dev-etcd-labs
k1s-single-etcd-containerd
k1s-ha-min

This gate proves the profile harness is direct-containerd-only and that it will not start k1s controller, API shim, etcd, NATS, or dashboard processes on the host.
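The gate above checks the harness flags but not the profile names themselves. A possible extra check, sketched here against an illustrative sample, compares the expected list with the names in the inventory. The .profiles[].name shape is taken from the summary command in Evidence Capture; everything else in the sample is assumed.

```shell
# Illustrative sample; only the .profiles[].name shape comes from this runbook.
sample='{"ok":true,"profiles":[{"name":"k1s-dev-min-sqlite"},{"name":"k1s-dev-etcd-labs"},{"name":"k1s-single-etcd-containerd"},{"name":"k1s-ha-min"}]}'
expected='["k1s-dev-min-sqlite","k1s-dev-etcd-labs","k1s-single-etcd-containerd","k1s-ha-min"]'

# Array subtraction leaves only the expected names that are absent.
missing="$(printf '%s' "$sample" | jq -r --argjson expected "$expected" \
  '($expected - [.profiles[].name]) | join(",")')"

if [ -z "$missing" ]; then
  echo "all expected profiles present"
else
  echo "missing profiles: $missing"
fi
```

In a real run, the sample would be replaced by reading "$RUN_DIR/profile-list.json".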

Startup Validation Sweep

Run every built-in profile through the profile startup scenario:

for profile in \
  k1s-dev-min-sqlite \
  k1s-dev-etcd-labs \
  k1s-single-etcd-containerd \
  k1s-ha-min
do
  wbcd validate \
    --scenario k1s-profile \
    --profile "$profile" \
    --k1s-root "$K1S_ROOT" \
    --timeout 240 | tee "$RUN_DIR/$profile-profile.json"

  jq -e '.ok and all(.checks[]; .ok)' "$RUN_DIR/$profile-profile.json"
done
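Unless the shell runs with set -e, a failing jq -e inside the loop above does not stop the sweep, so a failure in an early profile can scroll past unnoticed. One way to handle that, sketched with illustrative result files in a temporary directory, is to collect the failing profile names and report them at the end:

```shell
# Illustrative result files; real ones come from `wbcd validate`.
demo_dir="$(mktemp -d)"
printf '%s' '{"ok":true,"checks":[{"name":"profile-start","ok":true}]}' \
  > "$demo_dir/k1s-dev-min-sqlite-profile.json"
printf '%s' '{"ok":true,"checks":[{"name":"profile-start","ok":true},{"name":"ha-leader-observed","ok":false}]}' \
  > "$demo_dir/k1s-ha-min-profile.json"

failed=""
for profile in k1s-dev-min-sqlite k1s-ha-min; do
  # Same acceptance filter as the sweep: top-level ok and every check ok.
  if ! jq -e '.ok and all(.checks[]; .ok)' "$demo_dir/$profile-profile.json" >/dev/null; then
    failed="$failed $profile"
  fi
done

echo "failed profiles:${failed:- none}"
rm -rf "$demo_dir"
```

The second sample shows why the filter checks both levels: a top-level ok of true does not pass when any individual check is false.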

Startup acceptance:

  • profile-start, components-running, and containerized-components pass for every profile.
  • k1s-ha-min also passes ha-leader-observed.
  • profile.status.components shows container IDs for each component.
  • The project dashboard/API routes appear under https://k1s.<project>.workerbee.localhost:19443/ and https://k1s-api.<project>.workerbee.localhost:19443/.

Use this sweep after k1s controller, API shim, state-backend, dashboard, profile entrypoint, or WorkerBee profile harness changes.
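The route pattern in the startup acceptance list can be expanded for a concrete project to know which URLs to look for in the JSON output. This is a small sketch; profile_routes is a hypothetical helper and k1spvdemo is a placeholder project name.

```shell
# Hypothetical helper: print the dashboard and API URLs for a project,
# following the route pattern in the startup acceptance list.
profile_routes() {
  project="$1"
  echo "https://k1s.${project}.workerbee.localhost:19443/"
  echo "https://k1s-api.${project}.workerbee.localhost:19443/"
}

profile_routes "k1spvdemo"
```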

Workload Validation Gates

Run the app-engine-in-app-engine workload path on the k1s-single-etcd-containerd and k1s-ha-min profiles. Use a separate WorkerBee project for each workload profile. A same-project sequential run can leave enough nested app state for the second apply to fail with 409 Conflict; a clean-project pass is the validation signal for the profile runtime path.

for entry in \
  "k1s-single-etcd-containerd:$WB_SINGLE_PROJECT" \
  "k1s-ha-min:$WB_HA_PROJECT"
do
  profile="${entry%%:*}"
  project="${entry#*:}"
  result="$RUN_DIR/$project-$profile-workload.json"

  wbcd_project "$project" validate \
    --scenario profile-workload \
    --profile "$profile" \
    --k1s-root "$K1S_ROOT" \
    --timeout 420 | tee "$result"

  jq -e \
    '.ok
     and all(.checks[]; .ok)
     and .websocket.ok
     and all(.probes[]; .ok)
     and all(.exports[]; .ok)' \
    "$result"
done
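The loop above packs each profile and its project into one "profile:project" string and splits it with shell parameter expansion. A standalone illustration of that split, using a placeholder project name:

```shell
entry="k1s-ha-min:k1spvdemoh"   # placeholder project name
profile="${entry%%:*}"          # drop the longest suffix that starts at the first ':'
project="${entry#*:}"           # drop the shortest prefix that ends at the first ':'
echo "profile=$profile project=$project"
```

This works because neither the profile names nor the project names in this runbook contain a colon.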

Workload acceptance:

  • Bundled realtime db, backend, and frontend images build.
  • WorkerBee stages the realtime-web-db native k1s bundle.
  • The bundle deploys into the running profile through the project-scoped API.
  • Workload status is OK for pods, services, deployments, jobs, and ingress.
  • Dashboard, docs, controller health, API health, and OpenAPI checks pass over WorkerBee HTTPS ingress.
  • App/API HTTPS probes pass.
  • The WebSocket probe returns the expected echo response.
  • k1s, Kubernetes, and Helm handoff exports all succeed.

This is the release-confidence gate for WorkerBee profile behavior because it exercises profile startup, image build, native k1s staging, nested deploy, WorkerBee ingress, WebSocket transport, status collection, and export. If you need to rerun one workload gate, either use a new isolated project suffix or stop and purge that workload project before retrying.

Evidence Capture

After the gates, capture the final project/profile state:

for project in "$WB_PROJECT" "$WB_SINGLE_PROJECT" "$WB_HA_PROJECT"
do
  wbcd_project "$project" profile status --k1s-root "$K1S_ROOT" \
    | tee "$RUN_DIR/$project-profile-status-after.json" || true
done

wbcd projects | tee "$RUN_DIR/projects-after.json"

Summarize the run with:

jq -r '
  "profile-list ok=\(.ok)",
  "profiles=\([.profiles[].name] | join(","))"
' "$RUN_DIR/profile-list.json"

for file in "$RUN_DIR"/*-profile.json "$RUN_DIR"/*-workload.json; do
  test -e "$file" || continue
  jq -r '
    "project=\(.project // "unknown") profile=\(.profile): ok=\(.ok) checks=\([.checks[] | "\(.name)=\(.ok)"] | join(","))"
  ' "$file"
done
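The summary above prints every check. When a gate fails, it can be quicker to list only the failing checks. A sketch against an illustrative sample follows; the .checks[].name and .ok fields are the ones the summary command already reads, and the rest of the sample is assumed.

```shell
# Illustrative sample; real result files come from `wbcd validate`.
sample='{"project":"k1spvdemoh","profile":"k1s-ha-min","ok":false,"checks":[{"name":"profile-start","ok":true},{"name":"ha-leader-observed","ok":false}]}'

# Keep only checks whose ok field is false or missing, then join their names.
failing="$(printf '%s' "$sample" | jq -r '[.checks[] | select(.ok | not) | .name] | join(",")')"
echo "failing checks: ${failing:-none}"
```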

Attach or reference $RUN_DIR in the k1s validation note, issue, or PR. Keep the raw JSON files because they include exact profile names, routes, probe results, exported artifact paths, and WorkerBee state roots.

Cleanup

Stop the active profile in each project without deleting captured state:

for project in "$WB_PROJECT" "$WB_SINGLE_PROJECT" "$WB_HA_PROJECT"
do
  wbcd_project "$project" profile stop --k1s-root "$K1S_ROOT" \
    | tee "$RUN_DIR/$project-profile-stop.json" || true
done

Only purge when the project is disposable and no other agent is using it:

for project in "$WB_PROJECT" "$WB_SINGLE_PROJECT" "$WB_HA_PROJECT"
do
  wbcd_project "$project" profile stop --k1s-root "$K1S_ROOT" --purge \
    | tee "$RUN_DIR/$project-profile-purge.json" || true
done

Inspect broader stale resources before deleting anything outside the project:

wbcd cleanup | tee "$RUN_DIR/cleanup-dry-run.json"

Use cleanup --execute only after confirming the dry-run output targets WorkerBee-owned namespaces, networks, and containers under the selected state root.

Failure Triage

If MCP or ingress is not ready:

wbcd mcp status --host 127.0.0.1 --port 8765 | tee "$RUN_DIR/mcp-status-failed.json"
wbcd ingress status | tee "$RUN_DIR/ingress-status-failed.json"

If direct containerd access fails:

wbcd containerd-privilege status | tee "$RUN_DIR/containerd-privilege-failed.json"

If the helper is not responsive, refresh sudo in an interactive shell if available and restart MCP with the same direct-containerd flags before retrying:

sudo -v
wbcd mcp restart --host 127.0.0.1 --port 8765 --timeout 90 | tee "$RUN_DIR/mcp-restart-retry.json"

In noninteractive shells, sudo -n -v may fail even when an existing sudo-helper is still usable. Trust the effective helper status and probe result from containerd-privilege status over the sudo refresh attempt. If MCP status shows saved daemon argv that mentions unprivileged while containerd-privilege status reports effective_mode=sudo-helper, treat the effective status as the runtime signal and restart MCP explicitly if the mismatch blocks triage.

If a workload gate fails with 409 Conflict during native k1s apply, rerun that same profile in a fresh WorkerBee project before treating it as a k1s runtime regression:

export WB_HA_PROJECT="k1svha"
wbcd_project "$WB_HA_PROJECT" validate \
  --scenario profile-workload \
  --profile k1s-ha-min \
  --k1s-root "$K1S_ROOT" \
  --timeout 420 | tee "$RUN_DIR/$WB_HA_PROJECT-k1s-ha-min-workload-retry.json"

The expected signal is a clean-project pass. A same-project conflict is a validation-flow idempotency issue unless the same profile also fails from clean state.

If a profile is degraded, capture status before stopping it:

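# $project is whatever the most recent loop left set; set it explicitly if the degraded profile is in another project.
fail_project="${project:-$WB_PROJECT}"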
wbcd_project "$fail_project" profile status --k1s-root "$K1S_ROOT" \
  | tee "$RUN_DIR/$fail_project-profile-status-failed.json"
wbcd projects | tee "$RUN_DIR/projects-failed.json"

Profile workload logs are exposed through MCP as workerbee_v1_logs(target="profile", profile=<profile>, app=<app>); use that tool when the agent has MCP access.

If a stale profile holds ports or routes, stop the exact project first. Use --purge only for the short explicit projects used by this run.

MCP Tool Equivalent

Agents with WorkerBee MCP tools available can run the same ladder through MCP:

  • Start with workerbee_v1_session_start for the ../k1s cwd and use a short explicit project for profile work.
  • Check direct-containerd readiness with workerbee_v1_ingress_status and workerbee_v1_profile_list.
  • Run startup gates with workerbee_v1_profile_validate for each built-in profile.
  • Run release-confidence gates with workerbee_v1_profile_workload_validate on k1s-single-etcd-containerd and k1s-ha-min, using a separate project for each workload profile.
  • Inspect nested workload state with workerbee_v1_profile_workload_status, logs with workerbee_v1_logs(target="profile"), and HTTPS routes with workerbee_v1_ingress_probe.

The CLI runbook remains the canonical copy/paste path for a k1s agent shell because it makes the selected WorkerBee binary, state root, project, cwd, and JSON artifacts explicit.
