Upgrading from openab-0.5.0 to openab-0.7.0 hit three distinct issues that compounded into extended downtime. Documenting all findings here so future PRs can reference them.
Environment
Chart: openab-0.5.0 → openab-0.7.0
Platform: Kubernetes (OrbStack, local)
Release name: openab
Issue 1: PVC data loss on upgrade (silent)
Root cause: The chart restructured from flat persistence.* to agents.<name>.persistence.*. This changes the PVC name from openab to openab-kiro. Helm treats the old PVC as "no longer part of this release" and deletes it.
Impact: All agent data lost — kiro auth (data.sqlite3), steering files, gh config, session history.
Workaround applied:
# Protect old PVC before upgrade
kubectl annotate pvc openab helm.sh/resource-policy=keep
# After upgrade, mount old PVC via rescue pod and copy data to new PVC
kubectl run pvc-rescue --image=busybox --restart=Never --overrides='...(mount old PVC)...'
kubectl cp pvc-rescue:/old/ /tmp/backup/
kubectl cp /tmp/backup/.kiro <new-pod>:/home/agent/.kiro
kubectl cp /tmp/backup/.local <new-pod>:/home/agent/.local # ← kiro auth lives here
kubectl cp /tmp/backup/.config <new-pod>:/home/agent/.config
Recommended fix (chart-level):
Add the helm.sh/resource-policy: keep annotation to templates/pvc.yaml. This prevents accidental PVC deletion on any future rename. The Secret template already has this annotation; the PVC should too. (See the comment on PR #166, "feat(helm): add persistence.existingClaim support".)
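Until the chart carries the annotation itself, a pre-upgrade check can confirm that the live PVC is protected. The following is a minimal sketch, assuming kubectl already points at the right cluster and namespace; the function name pvc_is_protected is illustrative and not part of the chart.

```shell
# Succeeds only if the named PVC carries helm.sh/resource-policy=keep.
# The function name is illustrative; the PVC name is supplied by the caller.
pvc_is_protected() {
  pvc="$1"
  # Dots inside an annotation key must be escaped in kubectl's jsonpath.
  policy=$(kubectl get pvc "$pvc" \
    -o jsonpath='{.metadata.annotations.helm\.sh/resource-policy}') || return 1
  if [ "$policy" = "keep" ]; then
    echo "ok: pvc/$pvc is annotated keep"
  else
    echo "warning: pvc/$pvc has no keep policy; annotate it before upgrading" >&2
    return 1
  fi
}

# Usage:
#   pvc_is_protected openab || kubectl annotate pvc openab helm.sh/resource-policy=keep
```

The check reads the live object rather than the chart template, so it also covers annotations added by hand with kubectl annotate.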
Issue 2: image.tag defaults to stale commit hash, not appVersion
Root cause: values.yaml in chart 0.7.0 hardcodes tag: "94253a5" (an old commit). The chart metadata says appVersion: "0.7.0" and the ghcr.io/openabdev/openab:0.7.0 image exists, but the default tag doesn't point to it.
Impact: After upgrading to chart 0.7.0, the pod still runs the old binary. New features (STT voice transcription) silently don't work — the config is loaded but the code to handle it doesn't exist in the old image.
Workaround applied:
--set image.tag=0.7.0
Recommended fix: Either:
Remove the hardcoded tag from values.yaml and let the template fall back to .Chart.AppVersion (the comment already says "tag defaults to .Chart.AppVersion" but the hardcoded value overrides it)
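Because the stale tag fails silently, it may help to verify the running image after every upgrade. The sketch below assumes the workload is a Deployment whose first container is the agent; the Deployment name is not stated in this report, so it is passed in as a parameter, and both function names are illustrative.

```shell
# Print the tag of the first container image in a Deployment.
# Simplification: images referenced by digest or without a tag are not handled.
running_image_tag() {
  deploy="$1"
  image=$(kubectl get deployment "$deploy" \
    -o jsonpath='{.spec.template.spec.containers[0].image}') || return 1
  # Strip everything up to the last ':' to leave the tag.
  printf '%s\n' "${image##*:}"
}

# Succeed only if the running tag matches the expected one.
check_image_tag() {
  deploy="$1"; expected="$2"
  actual=$(running_image_tag "$deploy") || return 1
  if [ "$actual" = "$expected" ]; then
    echo "ok: $deploy runs tag $actual"
  else
    echo "mismatch: $deploy runs tag $actual, expected $expected" >&2
    return 1
  fi
}

# Usage (Deployment name is a placeholder):
#   check_image_tag <deployment> 0.7.0
```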
Issue 3: helm upgrade silently drops values not explicitly passed
Root cause: When an upgrade command passes values (--set or -f) without --reuse-values, Helm does not merge them with the user-supplied values of the previous revision. Any value not passed in the upgrade command resets to the chart default. We had 3 Discord channel IDs configured; the upgrade command passed only 2, so the bot silently ignored messages from the third channel.
Impact: Bot appeared "down" — connected to Discord, no errors in logs, but not responding in one channel. Difficult to diagnose because the logs showed channels=2 with no warning about the change.
Workaround applied:
# Always capture full state before upgrading
helm get values <release>
kubectl get configmap <name> -o yaml
# Then pass ALL values explicitly
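When a value has already been dropped, comparing the user-supplied values of two revisions shows exactly what changed. This is a sketch that relies on helm get values with its --revision flag; the function name and the revision numbers in the usage line are illustrative.

```shell
# Show what changed in user-supplied values between two revisions of a release.
# Returns diff's status: 0 if identical, 1 if they differ, 2 on error.
diff_release_values() {
  release="$1"; old_rev="$2"; new_rev="$3"
  tmpdir=$(mktemp -d) || return 2
  helm get values "$release" --revision "$old_rev" -o yaml > "$tmpdir/old.yaml" \
    || { rm -rf "$tmpdir"; return 2; }
  helm get values "$release" --revision "$new_rev" -o yaml > "$tmpdir/new.yaml" \
    || { rm -rf "$tmpdir"; return 2; }
  status=0
  diff -u "$tmpdir/old.yaml" "$tmpdir/new.yaml" || status=$?
  rm -rf "$tmpdir"
  return "$status"
}

# Usage (revision numbers come from `helm history <release>`):
#   diff_release_values openab 3 4
```

A removed channel ID would show up as a changed line in the unified diff, which is easier to spot than a channels=2 log line.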
Recommended improvement:
The chart NOTES.txt could print the configured channel count and IDs after install/upgrade, making it easier to spot missing channels.
Consider supporting a values.yaml file approach for upgrades instead of long --set chains.
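The values-file approach can be sketched with flags Helm already provides. The example assumes a values file saved beforehand (for instance from helm get values <release> -o yaml); the chart reference is not given in this report, so chart_ref is a placeholder, and the function name is illustrative.

```shell
# Upgrade from a values file instead of a long --set chain.
# chart_ref is an assumption: the chart location is not stated in this report.
upgrade_from_file() {
  release="$1"; chart_ref="$2"; chart_version="$3"; values_file="$4"; image_tag="$5"
  # Refuse to upgrade with a missing or empty values file, since that
  # would reset every user-supplied value to the chart default.
  [ -s "$values_file" ] || {
    echo "values file $values_file is missing or empty" >&2
    return 1
  }
  helm upgrade "$release" "$chart_ref" \
    --version "$chart_version" \
    -f "$values_file" \
    --set image.tag="$image_tag"
}

# Usage (chart reference is a placeholder):
#   helm get values openab -o yaml > openab-values.yaml
#   upgrade_from_file openab <chart-ref> 0.7.0 openab-values.yaml 0.7.0
```

Keeping the file in version control gives a reviewable record of every channel ID, which a --set chain typed at the prompt does not.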
Issue 1, further recommended fix (chart-level): persistence.existingClaim support (PR #166, addresses #120), which lets users explicitly point to an old PVC during migration.
Issue 1, related: #117, #120, PR #166, external writeup
Issue 2, related: #235
Suggested pre-upgrade checklist (for docs)
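The original checklist did not survive extraction. As a stand-in, the workarounds described in this report can be combined into one pre-upgrade step. This is a sketch and not the author's checklist: the function name is illustrative, and helm get manifest is swapped in for the kubectl get configmap command used above so that all rendered resources are captured.

```shell
# Pre-upgrade snapshot: save the user-supplied values and the rendered
# manifest of the current revision, then mark the PVC as keep.
preupgrade_snapshot() {
  release="$1"; pvc="$2"; outdir="$3"
  mkdir -p "$outdir" || return 1
  helm get values "$release" -o yaml > "$outdir/values.yaml" || return 1
  helm get manifest "$release" > "$outdir/manifest.yaml" || return 1
  # --overwrite makes the step safe to repeat.
  kubectl annotate pvc "$pvc" helm.sh/resource-policy=keep --overwrite || return 1
  echo "snapshot written to $outdir"
}

# Usage:
#   preupgrade_snapshot openab openab ./pre-upgrade-snapshot
```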
Data locations inside the pod (for reference)
kiro auth: .local/share/kiro-cli/data.sqlite3
Steering files: .kiro/steering/
gh config: .config/gh/
.kiro/settings/cli.json
Session history: .kiro/sessions/
.semantic_search/
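With these locations known, the data can be archived out of the running pod before an upgrade touches the PVC. The sketch below assumes tar is available in the container image and that the home directory is /home/agent, as in the kubectl cp commands earlier in this report. It covers the three top-level directories that the workaround copied; .semantic_search/ is left out because its parent directory is not stated. The function name is illustrative.

```shell
# Stream an archive of the agent's data directories out of a running pod.
# Assumes tar exists in the image and the home directory is /home/agent.
backup_agent_data() {
  pod="$1"; archive="$2"
  kubectl exec "$pod" -- tar czf - -C /home/agent .local .kiro .config \
    > "$archive" || return 1
  # Refuse to report success for an empty archive.
  [ -s "$archive" ] || { echo "backup $archive is empty" >&2; return 1; }
  echo "backup written to $archive"
}

# Usage (pod name is a placeholder):
#   backup_agent_data <pod> ./openab-agent-backup.tgz
```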