feat(helm): add pod extensibility controls by thekkagent · Pull Request #2 · thekkagent/openab

thekkagent · 2026-04-15T12:11:45Z

What problem does this solve?

OpenAB's Helm chart is currently strong on the minimal install path, but it is too thin for many real Kubernetes deployments.

It does not currently expose several common deployment controls that operators typically expect from a reusable open-source chart, including:

imagePullSecrets
probes (livenessProbe, readinessProbe, startupProbe)
lifecycle
initContainers
sidecars
extra volumes / volume mounts
serviceAccountName binding
PodDisruptionBudget
extra ConfigMap / Secret injection patterns
other pod-level deployment settings that are common in production clusters

This leaves users with a few unattractive options:

rebuild a custom image for relatively minor deployment-specific needs
patch rendered manifests after helm template
fork the chart just to add standard Kubernetes fields

One immediate example is that the chart does not currently support imagePullSecrets, which makes private registry deployments harder than they need to be.

Closes: N/A

Discord Discussion URL: https://discordapp.com/channels/1491295327620169908/1493841502529523732

At a Glance

┌──────────────────────────────┐
│ Current OpenAB Helm Chart    │
│ minimal install path         │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│ Missing common chart hooks   │
│ - imagePullSecrets           │
│ - probes / lifecycle         │
│ - initContainers / sidecars  │
│ - extra volumes / mounts     │
│ - serviceAccount             │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│ This PR                      │
│ pod / deployment             │
│ extensibility only           │
└──────────────────────────────┘

Prior Art & Industry Research

OpenClaw:

I reviewed the local OpenClaw repository and its Kubernetes deployment manifests. While OpenClaw does not currently ship a Helm chart in this repo, it does treat startup bootstrap as a first-class deployment concern. In particular, it uses an initContainer to prepare configuration and workspace state before the main container starts.

This suggests:

initContainers are a reasonable place for startup preparation
bootstrap logic should be treated as deployment design, not an ad hoc workaround
a hardened main container can stay simpler when initialization is separated

Reference:

scripts/k8s/manifests/deployment.yaml

Hermes Agent:

I reviewed Hermes Agent's Docker and deployment documentation. Hermes takes a clearer stance on tool installation: stable toolchains should primarily be handled through custom images or clearly defined mutable runtime environments, not by stretching the chart into a package manager.

This suggests:

serious toolchains should prefer custom images
runtime bootstrap can exist, but it should stay lightweight and bounded
Helm should expose deployment extension points, not replace image design

References:

Dockerfile
website/docs/user-guide/docker.md
website/docs/getting-started/nix-setup.md

Other references:

I reviewed Bitnami charts and Helm / Kubernetes best practices. Bitnami's more mature charts commonly expose imagePullSecrets, serviceAccount, probes, lifecycle, initContainers, sidecars, extraVolumes, extraVolumeMounts, extraEnvVarsCM, extraEnvVarsSecret, extraDeploy, and pdb. This PR adopts the subset of that surface area that fits under Deployment.spec.template.

Relevant upstream guidance also supports this direction:

Helm chart values and template best practices
Kubernetes guidance for initContainers
Kubernetes guidance for private registry pulls via imagePullSecrets

Proposed Solution

This PR implements the first phase of pod / deployment extensibility for the OpenAB Helm chart, focused entirely on Deployment.spec.template.

File	Purpose
`charts/openab/values.yaml`	Add first-phase Helm values for pod / deployment extensibility
`charts/openab/templates/_helpers.tpl`	Helper resolution for per-agent image pull policy, image pull secrets (with explicit opt-out), service account binding, and reserved-key-safe merging for pod annotations and labels
`charts/openab/templates/deployment.yaml`	Render pod-level extensibility controls into `Deployment.spec.template`
`charts/openab/tests/helm-template-test.sh`	Coverage for the new rendering behaviors, hijack-protection guard tests, and a fix for the test harness counter bug under `set -e`

Implemented scope:

imagePullSecrets (global + per-agent, with explicit opt-out)
per-agent imagePullPolicy (global default + per-agent override)
per-agent serviceAccountName binding
livenessProbe, readinessProbe, startupProbe, lifecycle
initContainers, sidecars
extraVolumes, extraVolumeMounts
podAnnotations, podLabels (global + per-agent, merged)

Explicitly out of scope:

PodDisruptionBudget
chart-managed ServiceAccount creation
RBAC resources
generic extra objects such as extraDeploy

For the "install tools" question specifically, the implementation keeps two clear paths:

Custom image — the preferred path for stable, repeatable, production-grade toolchains
initContainers + shared volume — a lightweight bootstrap path for small binaries or startup initialization

Safety & correctness

Two design decisions are load-bearing for correctness.

Reserved pod-metadata keys cannot be hijacked. agentPodLabels and agentPodAnnotations use mergeOverwrite(dict, global, per-agent, reserved), where the reserved key set is merged last. This guarantees:

checksum/config annotation is always the chart-computed hash — users cannot accidentally pin it or disable rollout-on-config-change
app.kubernetes.io/{name,instance,component} labels on the pod template always match spec.selector.matchLabels, so Deployment → Pod selector matching can never be broken by a user-supplied podLabels entry

Without this protection, a user who set podLabels.app.kubernetes.io/name: custom would silently produce duplicate YAML keys and a Deployment whose ReplicaSet could no longer match its own Pods.

Per-agent imagePullSecrets: [] explicitly opts out of the global list. The helper uses hasKey instead of a truthy check, so three states are distinguishable per agent:

key omitted → inherit .Values.imagePullSecrets
imagePullSecrets: [] → opt out (no pull secrets, even if global is set)
imagePullSecrets: [foo] → replace with per-agent list

Why this approach?

I do not think OpenAB should model every Kubernetes field at once, and I do not think Helm should become the primary mechanism for packaging arbitrary tools.

At the same time, the current chart is thin enough that users are pushed toward forking it for fairly normal deployment requirements. That creates unnecessary friction for a public chart.

This approach takes the middle path:

keep the default chart simple
expose the extension points that operators commonly expect
keep custom images as the primary answer for serious tool installation
treat initContainers bootstrap as a lightweight complement, not the main packaging model
phase the work so the chart does not grow all at once

Tradeoffs and limitations:

the chart surface will grow
pod-metadata merge rules must be unambiguous and predictable — addressed by the reserved-last mergeOverwrite pattern described above
features like PDB or RBAC should stay optional to avoid over-designing the chart too early

Alternatives Considered

The main implementation question is not whether Helm extensibility should be improved, but how much should be included in the first implementation PR.

1. Do the minimum: only `imagePullSecrets` — rejected

This would solve the most immediate private-registry gap, but it would still leave the chart without probes, lifecycle hooks, and pod composition controls such as initContainers, sidecars, and extra volumes. The result would still feel incomplete for real deployments.

2. Implement pod / deployment extensibility only — chosen

This keeps the first implementation focused on a single surface area: Deployment.spec.template. It addresses the highest-value deployment gaps first, including private registry support, health / lifecycle controls, and pod composition hooks, while keeping the PR cohesive and reviewable.

3. Implement everything at once, including PDB, RBAC, and generic extra objects — rejected

Although these features are valid chart capabilities, they extend beyond pod template extensibility and would make the first implementation significantly broader. Mixing pod-level changes with additional chart-managed resources would increase review complexity and make the initial rollout harder to reason about.

Validation

helm lint charts/openab passes
Rendering coverage added for:
- imagePullSecrets (global + per-agent)
- per-agent imagePullPolicy
- initContainers
- sidecars
- extra volumes / mounts
- probes (livenessProbe, readinessProbe, startupProbe)
- lifecycle
- serviceAccountName
- pod annotations and labels (global + per-agent merged)
Hardening coverage added for:
- reserved selector labels cannot be hijacked by user podLabels
- checksum/config annotation cannot be overridden by user podAnnotations
- per-agent imagePullSecrets: [] opts out of the global list
- per-agent podLabels override global for the same non-reserved key
Manual deploy-oriented verification on a real cluster

Validation command:

bash charts/openab/tests/helm-template-test.sh

Result: 18 passed, 0 failed

Open questions

Should follow-up work add PodDisruptionBudget before or after chart-managed ServiceAccount support?
Should generic extra objects be exposed via extraDeploy, extraObjects, or deferred further?
Should this PR stay with "reference an externally-created ServiceAccount" (the current approach), or should follow-up work add chart-managed ServiceAccount creation with RBAC?

Add pod/deployment extensibility to the Helm chart, focused on Deployment.spec.template scope: imagePullSecrets (global + per-agent with opt-out), per-agent imagePullPolicy, serviceAccountName, probes, lifecycle, initContainers, sidecars, extraVolumes, extraVolumeMounts, podAnnotations and podLabels with reserved-key protection. Also fixes Test 1-7 to pass botToken after upstream discord.enabled gate change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The env map previously only supported simple key-value strings. This adds polymorphic rendering via kindIs "map" so users can use valueFrom (fieldRef, secretKeyRef, configMapKeyRef, etc.) while maintaining full backward compatibility with existing string values. In deployment.yaml, map-typed values render as toYaml (e.g. valueFrom blocks). In configmap.yaml, map-typed values are filtered out since config.toml only supports string env vars — valueFrom is a Kubernetes concept handled at the pod level. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Rename `sidecars` to `extraContainers` to avoid confusion with K8s 1.28+ native sidecar support — this field renders into the main containers array, not as native sidecars. Also adds a "Pod extensibility" comment header and documents that `{}` means disabled for probes and lifecycle fields. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…imagePullSecrets format - Fix pre-existing TOML injection risk where double quotes in env string values produced invalid TOML. Use toJson for proper escaping. - Add kindIs "slice" guard in both deployment.yaml and configmap.yaml to reject list-typed env values with a clear error message. - Derive agentPodLabels reserved keys from selectorLabels helper to prevent future drift between Deployment selector and pod labels. - Support both K8s native object format ({ name: regcred }) and shorthand string format for imagePullSecrets. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace the deleted helm-template-test.sh with helm unittest YAML files: - deployment_test.yaml: pod extensibility controls (18 tests) - env_test.yaml: polymorphic env rendering + configmap filtering (5 tests) Upstream already migrated to helm unittest (configmap_test.yaml, adapter-enablement_test.yaml). Total: 36 tests across 5 suites. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… helper Before: configmap used strict truthy check (`if ($cfg.discord).enabled`) while deployment and secret used default-true (`ne toString "false"`). This asymmetry came from upstream fix commits and caused different behavior for edge cases across templates. Introduces `openab.discordEnabled` helper following the same pattern as `openab.agentEnabled` and `openab.persistenceEnabled`. Returns "false" when discord config is absent or explicitly disabled, "true" otherwise. All three templates (configmap, deployment, secret) now use the helper for consistent behavior. Also strengthens the reserved-label-hijack test to assert both `app.kubernetes.io/name` and `app.kubernetes.io/component` are protected. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-17T08:22:57Z

⚠️ This PR is missing a Discord Discussion URL in the body.

All PRs must reference a prior Discord discussion to ensure community alignment before implementation.

Please edit the PR description to include a link like:

Discord Discussion URL: https://discord.com/channels/...

This PR will be automatically closed in 3 days if the link is not added.

Phase 2 of helm chart extensibility (openabdev#397): - ServiceAccount (optional, opt-in via serviceAccount.create) - Supports custom name, annotations (e.g. IRSA), and automountServiceAccountToken - Backward compat: existing serviceAccountName still works for external SAs - Role + RoleBinding (optional, opt-in via rbac.create) - ClusterRole + ClusterRoleBinding (optional, opt-in via rbac.createClusterRole) - PodDisruptionBudget (optional, opt-in via podDisruptionBudget.enabled) - Supports both minAvailable and maxUnavailable (mutually exclusive) - values.yaml warns against minAvailable: 1 with the hardcoded replicas: 1 All resources are per-agent and follow the existing multi-agent loop pattern. All defaults are opt-in (false) to preserve existing behavior. 25 new tests added across 3 suites (serviceaccount, rbac, pdb). Total: 61 tests pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-20T10:49:24Z

🔒 Auto-closing: this PR has had the closing-soon label for more than 3 days without a Discord Discussion URL being added.

Feel free to reopen after adding the discussion link to the PR body.

thekkagent mentioned this pull request Apr 15, 2026

docs: helm chart extensibility enhancement #1

Closed

3 tasks

kirkchen force-pushed the feat/helm-chart-extensibility branch from 665ab78 to eb21a81 Compare April 16, 2026 01:13

thekkagent force-pushed the feat/helm-chart-extensibility branch from 6ad4050 to 0e70a04 Compare April 16, 2026 11:01

thekkagent and others added 6 commits April 17, 2026 10:00

thekkagent force-pushed the feat/helm-chart-extensibility branch from 0e70a04 to 8dad081 Compare April 17, 2026 08:22

github-actions bot added the closing-soon label Apr 17, 2026

github-actions bot closed this Apr 20, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(helm): add pod extensibility controls#2

feat(helm): add pod extensibility controls#2
thekkagent wants to merge 7 commits intomainfrom
feat/helm-chart-extensibility

thekkagent commented Apr 15, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Apr 17, 2026

Uh oh!

github-actions bot commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

thekkagent commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What problem does this solve?

At a Glance

Prior Art & Industry Research

Proposed Solution

Safety & correctness

Why this approach?

Alternatives Considered

1. Do the minimum: only imagePullSecrets — rejected

2. Implement pod / deployment extensibility only — chosen

3. Implement everything at once, including PDB, RBAC, and generic extra objects — rejected

Validation

Open questions

Uh oh!

github-actions bot commented Apr 17, 2026

Uh oh!

github-actions bot commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

thekkagent commented Apr 15, 2026 •

edited

Loading

1. Do the minimum: only `imagePullSecrets` — rejected