Skip to content

feat(helm): add pod extensibility controls#2

Closed
thekkagent wants to merge 7 commits intomainfrom
feat/helm-chart-extensibility
Closed

feat(helm): add pod extensibility controls#2
thekkagent wants to merge 7 commits intomainfrom
feat/helm-chart-extensibility

Conversation

@thekkagent
Copy link
Copy Markdown
Owner

@thekkagent thekkagent commented Apr 15, 2026

What problem does this solve?

OpenAB's Helm chart is currently strong on the minimal install path, but it is too thin for many real Kubernetes deployments.

It does not currently expose several common deployment controls that operators typically expect from a reusable open-source chart, including:

  • imagePullSecrets
  • probes (livenessProbe, readinessProbe, startupProbe)
  • lifecycle
  • initContainers
  • sidecars
  • extra volumes / volume mounts
  • serviceAccountName binding
  • PodDisruptionBudget
  • extra ConfigMap / Secret injection patterns
  • other pod-level deployment settings that are common in production clusters

This leaves users with a few unattractive options:

  1. rebuild a custom image for relatively minor deployment-specific needs
  2. patch rendered manifests after helm template
  3. fork the chart just to add standard Kubernetes fields

One immediate example is that the chart does not currently support imagePullSecrets, which makes private registry deployments harder than they need to be.

Closes: N/A

Discord Discussion URL: https://discordapp.com/channels/1491295327620169908/1493841502529523732

At a Glance

┌──────────────────────────────┐
│ Current OpenAB Helm Chart    │
│ minimal install path         │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│ Missing common chart hooks   │
│ - imagePullSecrets           │
│ - probes / lifecycle         │
│ - initContainers / sidecars  │
│ - extra volumes / mounts     │
│ - serviceAccount             │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│ This PR                      │
│ pod / deployment             │
│ extensibility only           │
└──────────────────────────────┘

Prior Art & Industry Research

OpenClaw:

I reviewed the local OpenClaw repository and its Kubernetes deployment manifests. While OpenClaw does not currently ship a Helm chart in this repo, it does treat startup bootstrap as a first-class deployment concern. In particular, it uses an initContainer to prepare configuration and workspace state before the main container starts.

This suggests:

  • initContainers are a reasonable place for startup preparation
  • bootstrap logic should be treated as deployment design, not an ad hoc workaround
  • a hardened main container can stay simpler when initialization is separated

Reference:

  • scripts/k8s/manifests/deployment.yaml

Hermes Agent:

I reviewed Hermes Agent's Docker and deployment documentation. Hermes takes a clearer stance on tool installation: stable toolchains should primarily be handled through custom images or clearly defined mutable runtime environments, not by stretching the chart into a package manager.

This suggests:

  • serious toolchains should prefer custom images
  • runtime bootstrap can exist, but it should stay lightweight and bounded
  • Helm should expose deployment extension points, not replace image design

References:

  • Dockerfile
  • website/docs/user-guide/docker.md
  • website/docs/getting-started/nix-setup.md

Other references:

I reviewed Bitnami charts and Helm / Kubernetes best practices. Bitnami's more mature charts commonly expose imagePullSecrets, serviceAccount, probes, lifecycle, initContainers, sidecars, extraVolumes, extraVolumeMounts, extraEnvVarsCM, extraEnvVarsSecret, extraDeploy, and pdb. This PR adopts the subset of that surface area that fits under Deployment.spec.template.

Relevant upstream guidance also supports this direction:

  • Helm chart values and template best practices
  • Kubernetes guidance for initContainers
  • Kubernetes guidance for private registry pulls via imagePullSecrets

Proposed Solution

This PR implements the first phase of pod / deployment extensibility for the OpenAB Helm chart, focused entirely on Deployment.spec.template.

File Purpose
charts/openab/values.yaml Add first-phase Helm values for pod / deployment extensibility
charts/openab/templates/_helpers.tpl Helper resolution for per-agent image pull policy, image pull secrets (with explicit opt-out), service account binding, and reserved-key-safe merging for pod annotations and labels
charts/openab/templates/deployment.yaml Render pod-level extensibility controls into Deployment.spec.template
charts/openab/tests/helm-template-test.sh Coverage for the new rendering behaviors, hijack-protection guard tests, and a fix for the test harness counter bug under set -e

Implemented scope:

  • imagePullSecrets (global + per-agent, with explicit opt-out)
  • per-agent imagePullPolicy (global default + per-agent override)
  • per-agent serviceAccountName binding
  • livenessProbe, readinessProbe, startupProbe, lifecycle
  • initContainers, sidecars
  • extraVolumes, extraVolumeMounts
  • podAnnotations, podLabels (global + per-agent, merged)

Explicitly out of scope:

  • PodDisruptionBudget
  • chart-managed ServiceAccount creation
  • RBAC resources
  • generic extra objects such as extraDeploy

For the "install tools" question specifically, the implementation keeps two clear paths:

  1. Custom image — the preferred path for stable, repeatable, production-grade toolchains
  2. initContainers + shared volume — a lightweight bootstrap path for small binaries or startup initialization

Safety & correctness

Two design decisions are load-bearing for correctness.

Reserved pod-metadata keys cannot be hijacked. agentPodLabels and agentPodAnnotations use mergeOverwrite(dict, global, per-agent, reserved), where the reserved key set is merged last. This guarantees:

  • checksum/config annotation is always the chart-computed hash — users cannot accidentally pin it or disable rollout-on-config-change
  • app.kubernetes.io/{name,instance,component} labels on the pod template always match spec.selector.matchLabels, so DeploymentPod selector matching can never be broken by a user-supplied podLabels entry

Without this protection, a user who set podLabels.app.kubernetes.io/name: custom would silently produce duplicate YAML keys and a Deployment whose ReplicaSet could no longer match its own Pods.

Per-agent imagePullSecrets: [] explicitly opts out of the global list. The helper uses hasKey instead of a truthy check, so three states are distinguishable per agent:

  • key omitted → inherit .Values.imagePullSecrets
  • imagePullSecrets: [] → opt out (no pull secrets, even if global is set)
  • imagePullSecrets: [foo] → replace with per-agent list

Why this approach?

I do not think OpenAB should model every Kubernetes field at once, and I do not think Helm should become the primary mechanism for packaging arbitrary tools.

At the same time, the current chart is thin enough that users are pushed toward forking it for fairly normal deployment requirements. That creates unnecessary friction for a public chart.

This approach takes the middle path:

  • keep the default chart simple
  • expose the extension points that operators commonly expect
  • keep custom images as the primary answer for serious tool installation
  • treat initContainers bootstrap as a lightweight complement, not the main packaging model
  • phase the work so the chart does not grow all at once

Tradeoffs and limitations:

  • the chart surface will grow
  • pod-metadata merge rules must be unambiguous and predictable — addressed by the reserved-last mergeOverwrite pattern described above
  • features like PDB or RBAC should stay optional to avoid over-designing the chart too early

Alternatives Considered

The main implementation question is not whether Helm extensibility should be improved, but how much should be included in the first implementation PR.

1. Do the minimum: only imagePullSecrets — rejected

This would solve the most immediate private-registry gap, but it would still leave the chart without probes, lifecycle hooks, and pod composition controls such as initContainers, sidecars, and extra volumes. The result would still feel incomplete for real deployments.

2. Implement pod / deployment extensibility only — chosen

This keeps the first implementation focused on a single surface area: Deployment.spec.template. It addresses the highest-value deployment gaps first, including private registry support, health / lifecycle controls, and pod composition hooks, while keeping the PR cohesive and reviewable.

3. Implement everything at once, including PDB, RBAC, and generic extra objects — rejected

Although these features are valid chart capabilities, they extend beyond pod template extensibility and would make the first implementation significantly broader. Mixing pod-level changes with additional chart-managed resources would increase review complexity and make the initial rollout harder to reason about.

Validation

  • helm lint charts/openab passes
  • Rendering coverage added for:
    • imagePullSecrets (global + per-agent)
    • per-agent imagePullPolicy
    • initContainers
    • sidecars
    • extra volumes / mounts
    • probes (livenessProbe, readinessProbe, startupProbe)
    • lifecycle
    • serviceAccountName
    • pod annotations and labels (global + per-agent merged)
  • Hardening coverage added for:
    • reserved selector labels cannot be hijacked by user podLabels
    • checksum/config annotation cannot be overridden by user podAnnotations
    • per-agent imagePullSecrets: [] opts out of the global list
    • per-agent podLabels override global for the same non-reserved key
  • Manual deploy-oriented verification on a real cluster

Validation command:

bash charts/openab/tests/helm-template-test.sh

Result: 18 passed, 0 failed

Open questions

  1. Should follow-up work add PodDisruptionBudget before or after chart-managed ServiceAccount support?
  2. Should generic extra objects be exposed via extraDeploy, extraObjects, or deferred further?
  3. Should this PR stay with "reference an externally-created ServiceAccount" (the current approach), or should follow-up work add chart-managed ServiceAccount creation with RBAC?

@kirkchen kirkchen force-pushed the feat/helm-chart-extensibility branch from 665ab78 to eb21a81 Compare April 16, 2026 01:13
@thekkagent thekkagent force-pushed the feat/helm-chart-extensibility branch from 6ad4050 to 0e70a04 Compare April 16, 2026 11:01
thekkagent and others added 6 commits April 17, 2026 10:00
Add pod/deployment extensibility to the Helm chart, focused on
Deployment.spec.template scope: imagePullSecrets (global + per-agent
with opt-out), per-agent imagePullPolicy, serviceAccountName, probes,
lifecycle, initContainers, sidecars, extraVolumes, extraVolumeMounts,
podAnnotations and podLabels with reserved-key protection.

Also fixes Test 1-7 to pass botToken after upstream discord.enabled
gate change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The env map previously only supported simple key-value strings. This
adds polymorphic rendering via kindIs "map" so users can use valueFrom
(fieldRef, secretKeyRef, configMapKeyRef, etc.) while maintaining full
backward compatibility with existing string values.

In deployment.yaml, map-typed values render as toYaml (e.g. valueFrom
blocks). In configmap.yaml, map-typed values are filtered out since
config.toml only supports string env vars — valueFrom is a Kubernetes
concept handled at the pod level.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename `sidecars` to `extraContainers` to avoid confusion with K8s 1.28+
native sidecar support — this field renders into the main containers
array, not as native sidecars.

Also adds a "Pod extensibility" comment header and documents that `{}`
means disabled for probes and lifecycle fields.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…imagePullSecrets format

- Fix pre-existing TOML injection risk where double quotes in env string
  values produced invalid TOML. Use toJson for proper escaping.
- Add kindIs "slice" guard in both deployment.yaml and configmap.yaml to
  reject list-typed env values with a clear error message.
- Derive agentPodLabels reserved keys from selectorLabels helper to
  prevent future drift between Deployment selector and pod labels.
- Support both K8s native object format ({ name: regcred }) and
  shorthand string format for imagePullSecrets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the deleted helm-template-test.sh with helm unittest YAML files:
- deployment_test.yaml: pod extensibility controls (18 tests)
- env_test.yaml: polymorphic env rendering + configmap filtering (5 tests)

Upstream already migrated to helm unittest (configmap_test.yaml,
adapter-enablement_test.yaml). Total: 36 tests across 5 suites.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… helper

Before: configmap used strict truthy check (`if ($cfg.discord).enabled`)
while deployment and secret used default-true (`ne toString "false"`).
This asymmetry came from upstream fix commits and caused different
behavior for edge cases across templates.

Introduces `openab.discordEnabled` helper following the same pattern
as `openab.agentEnabled` and `openab.persistenceEnabled`. Returns
"false" when discord config is absent or explicitly disabled, "true"
otherwise. All three templates (configmap, deployment, secret) now
use the helper for consistent behavior.

Also strengthens the reserved-label-hijack test to assert both
`app.kubernetes.io/name` and `app.kubernetes.io/component` are
protected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@thekkagent thekkagent force-pushed the feat/helm-chart-extensibility branch from 0e70a04 to 8dad081 Compare April 17, 2026 08:22
@github-actions
Copy link
Copy Markdown

⚠️ This PR is missing a Discord Discussion URL in the body.

All PRs must reference a prior Discord discussion to ensure community alignment before implementation.

Please edit the PR description to include a link like:

Discord Discussion URL: https://discord.com/channels/...

This PR will be automatically closed in 3 days if the link is not added.

Phase 2 of helm chart extensibility (openabdev#397):

- ServiceAccount (optional, opt-in via serviceAccount.create)
  - Supports custom name, annotations (e.g. IRSA), and automountServiceAccountToken
  - Backward compat: existing serviceAccountName still works for external SAs
- Role + RoleBinding (optional, opt-in via rbac.create)
- ClusterRole + ClusterRoleBinding (optional, opt-in via rbac.createClusterRole)
- PodDisruptionBudget (optional, opt-in via podDisruptionBudget.enabled)
  - Supports both minAvailable and maxUnavailable (mutually exclusive)
  - values.yaml warns against minAvailable: 1 with the hardcoded replicas: 1

All resources are per-agent and follow the existing multi-agent loop pattern.
All defaults are opt-in (false) to preserve existing behavior.

25 new tests added across 3 suites (serviceaccount, rbac, pdb).
Total: 61 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🔒 Auto-closing: this PR has had the closing-soon label for more than 3 days without a Discord Discussion URL being added.

Feel free to reopen after adding the discussion link to the PR body.

@github-actions github-actions bot closed this Apr 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant