Conversation
Bugbot caught a High-severity miss in v1.2.0's release-scoping work (PR #72). The OpenShift SCC template was the one resource-monitor file not updated when the literal `tracebloc-resource-monitor` ServiceAccount name moved to `<Release.Name>-resource-monitor`. On OpenShift the SCC granted access to a SA name that no longer existed, so the resource-monitor DaemonSet pods would fail to launch (no SCC -> can't mount hostPath /proc and /sys for node metrics).

The SCC's metadata.name + ClusterRole.name + ClusterRoleBinding.name were ALREADY release-scoped (`tracebloc-resource-monitor-<release>` / `tracebloc-resource-monitor-scc-<release>`), so this slipped through — casual reading suggested it was already done.

Touchpoints in resource-monitor-scc.yaml:

- users[0]: now `{{ include "tracebloc.resourceMonitorName" . }}`
- ClusterRoleBinding subjects[0].name: same helper
- All `app: tracebloc-resource-monitor` labels: same helper, for consistency with the rest of the chart's resource-monitor templates
- Updated the kubernetes.io/description SCC annotation prose so the literal name doesn't appear there either (cosmetic, but easier to audit "no literal references" with a single grep)

Tests:

- platform_test.yaml gains 3 new cases: SCC users[0] points at the release-scoped SA, the ClusterRoleBinding subject does too, and two releases (stg + cisco/hasan-prod) produce non-colliding SA references.
- node_agents_namespace_test.yaml had a regression assertion checking the OLD literal name in users[0]; updated to the new release-scoped form (`RELEASE-NAME-resource-monitor`, helm-unittest's default release name when none is set).
- 98 -> 102 passing.

Verified end-to-end with two side-by-side `helm template` runs:

- stg -> users[0] = system:serviceaccount:tracebloc-node-agents:stg-resource-monitor
- hasan-prod -> users[0] = system:serviceaccount:tracebloc-node-agents:hasan-prod-resource-monitor

Chart bumped 1.2.1 -> 1.2.2 (patch — restores OpenShift parity that v1.2.0 inadvertently broke).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
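The non-collision property the tests assert reduces to plain string construction. A minimal sketch, with the namespace and the `<release>-resource-monitor` scheme taken from the commit text above; the helper function name here is illustrative, not the chart's template code:

```shell
# Build the SCC users[0] entry for a given Helm release, mirroring what
# `tracebloc.resourceMonitorName` renders (<Release.Name>-resource-monitor).
scc_user() {
  local release="$1"
  printf 'system:serviceaccount:tracebloc-node-agents:%s-resource-monitor\n' "$release"
}

scc_user stg         # system:serviceaccount:tracebloc-node-agents:stg-resource-monitor
scc_user hasan-prod  # system:serviceaccount:tracebloc-node-agents:hasan-prod-resource-monitor

# Two releases must never collide on the same SA reference:
[ "$(scc_user stg)" != "$(scc_user hasan-prod)" ] && echo "non-colliding"
```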
Bugbot follow-up to the v1.2.0/1.2.2 rename work. Two fresh issues:

1. (Medium) NOTES.txt:9 still hardcoded the literal `tracebloc-resource-monitor` for the resource-monitor DaemonSet display, while the actual DaemonSet name has been `<release>-resource-monitor` since v1.2.0. Operators see one name in the post-install banner and a different name when they `kubectl get ds`. Now routes through the same `tracebloc.resourceMonitorName` helper as the rest of the chart.

2. (Low) docs/migration-tools/generate.sh hardcoded `app.kubernetes.io/version: "1.1.0"` and `helm.sh/chart: client-1.1.0` on every pre-create PVC. The chart has moved through 1.1.0 → 1.2.3, and operators running generate.sh today get PVC labels stuck at 1.1.0 even though the install ahead is 1.2.3. Helm adoption itself is unaffected (it keys on meta.helm.sh/release-name, not the chart label), but the labels lie until a subsequent upgrade reconciles them, and `kubectl get pvc -L helm.sh/chart` is misleading during migration debugging. Fixed by reading name + version from client/Chart.yaml at generate time.

Plus a few stale prose references caught while auditing the same path (no functional impact, but the doc was directing operators at "client fix in 1.2.0" as if it were still pending):

- generate.sh inline comment on `resourceMonitor: false` rephrased from "until client-1.2.0 is published" to "until you have verified the chart you're installing is 1.2.0+"
- migrate-tenant.sh banner relabelled from "v1.1.0 spec sanity" to "mysql spec sanity (v1.1.0+ shape: ...)"
- README.md skip-table cell on `resourceMonitor: false` rewritten to reflect that 1.2.0+ has shipped — operators on >=1.2.0 can flip it to true without colliding with the stg release

Tests: 102 → 105 passing. New `client/tests/notes_test.yaml` covers:

- Release-scoped resource-monitor name appears in NOTES.txt
- A different release renders a different name (proves the helper isn't accidentally hardcoded)
- Negative regex guards against the literal `tracebloc-resource-monitor` reappearing followed by a non-suffix character (i.e. the bare pre-1.2.3 form, while still letting the SCC line `tracebloc-resource-monitor-<release>` further down the file pass)
- `resourceMonitor: false` removes the line entirely

End-to-end smoke of generate.sh confirms PVCs ship with the live chart version (`helm.sh/chart: client-1.2.3` after this commit, verified against /tmp/tracebloc-migration-<demo>/pvcs.yaml).

Stacked on PR #78 (v1.2.2 SCC fix), so this branch already contains the SCC SA-ref rename. Once #78 lands, the diff against develop will reduce to just this commit.

Chart bumped 1.2.2 → 1.2.3 (patch — operator-facing string fix + tooling correctness).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
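The negative-regex guard described above can be approximated outside helm-unittest with an extended regex, assuming "non-suffix character" means end-of-line or anything other than the `-` that begins a `-<release>` continuation (the test file's actual pattern may differ):

```shell
# Flag the bare literal name (pre-1.2.3 form) while letting the
# release-suffixed SCC form pass. `($|[^-])` matches end-of-line or any
# character other than '-', so `tracebloc-resource-monitor-stg` is allowed.
pattern='tracebloc-resource-monitor($|[^-])'

printf '%s\n' 'DaemonSet: tracebloc-resource-monitor' | grep -Eq "$pattern" && echo "bare literal: flagged"
printf '%s\n' 'SCC: tracebloc-resource-monitor-stg'   | grep -Eq "$pattern" || echo "suffixed form: passes"
```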
saqlainsyed007 approved these changes on Apr 28, 2026.

saadqbal added a commit that referenced this pull request on Apr 28, 2026:
* Merge pull request #71 from tracebloc/docs/migrations-correct-option-b

docs(migrations): correct Option B + add hasan-prod case + active-jobs pre-flight

* chore: add default CODEOWNERS for auto-reviewer assignment (#73)

* ci: add kanban closure-routing caller workflow (#75)

* fix(client): release-scope resource-monitor names so multiple releases coexist (v1.2.0) (#72)

Two client releases on the same cluster could not both deploy the resource-monitor DaemonSet because several resources templated into the shared tracebloc-node-agents namespace used the literal name `tracebloc-resource-monitor` rather than a release-scoped name. The second `helm install` failed with:

    Error: ServiceAccount "tracebloc-resource-monitor" in namespace "tracebloc-node-agents" exists and cannot be imported into the current release: invalid ownership metadata; ... must equal "hasan-prod": current value is "stg".

Surfaced during the 2026-04-27 hasan-prod migration on tracebloc-templates-prod; worked around at the time by setting resourceMonitor: false on the second release, which means prod customers currently lose their per-CLIENT_ID metric stream until this lands.

What changed:

- New helper `tracebloc.resourceMonitorName` -> `<Release.Name>-resource-monitor`, centralised in _helpers.tpl alongside the existing per-release name helpers (secretName, serviceAccountName, etc.).
- DaemonSet metadata.name, spec.selector.matchLabels.app, the pod label app=, and spec.template.spec.serviceAccountName all now go through the helper. The selector + pod label have to move together because DaemonSet selectors are namespace-scoped: two DaemonSets in tracebloc-node-agents both selecting `app: tracebloc-resource-monitor` would each grab the other's pods, which is worse than the surface bug.
- ServiceAccount metadata.name (resource-monitor-rbac.yaml) goes through the helper.
- ClusterRole / ClusterRoleBinding / Role / RoleBinding metadata.name were already release-scoped (`tracebloc-resource-monitor-<release>`) and stay as-is to avoid an unnecessary ClusterRole rename for upgrading installs. Only the *subject* names in (Cluster)RoleBinding change to point at the new SA.
- Mirrored secrets (CLIENT_ID + dockerconfigjson) in tracebloc-node-agents: the secret names were already release-scoped via tracebloc.secretName / tracebloc.registrySecretName, so they did not collide. Their `app` label was the literal value, which is harmless on uniquely-named resources but inconsistent — updated for consistency.
- Chart bumped 1.1.0 -> 1.2.0. Per-release naming of cluster-singleton resources is a behaviour change for existing installs (DaemonSet name, ServiceAccount name, and selector label all change), so a minor bump signals that operators should review.

Tests: 93 -> 98. New cases cover:

- DaemonSet name + selector + serviceAccountName all release-scoped
- ServiceAccount name release-scoped
- ClusterRoleBinding subject points at the release-scoped SA
- A second `helm template` with a different release name produces non-colliding names

Verified end-to-end via `helm template stg ./client` and `helm template hasan-prod ./client` on the same chart: ServiceAccount, DaemonSet, and ClusterRoleBinding subject names all diverge per release.

Upgrade path from 1.1.0: the DaemonSet and ServiceAccount rename triggers a Helm three-way merge that DELETEs the old `tracebloc-resource-monitor` resource and CREATEs the new release-scoped one, leaving a ~30-60s gap on each node where resource metrics are not collected. The DaemonSet selector is immutable, so the delete-then-create path is what we want — helm upgrade handles this automatically because the names diverge in the stored manifest. No manual orphan cleanup needed.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(client): allow training pods to reach mysql-client (v1.2.1) (#76)

The training-egress NetworkPolicy added in v1.1.0 only permitted DNS and external TCP/443. Training pods load their dataset from the in-namespace mysql-client over TCP/3306 (core/utils/database.py::load_dataframe_from_sql_table), so under any CNI that actually enforces NetworkPolicy the connect failed with errno 111 and the Job CrashLoopBackOff'd before the first batch:

    Database connection failed: 2003 (HY000): Can't connect to MySQL server on 'mysql-client:3306' (111)
    RuntimeError: Database connection is not available for load_dataframe_from_sql_table

Surfaced on a fresh client install (k3d / k3s, which enforces policy via the built-in kube-router) where jobs-manager could reach mysql but every training Job spawned with tracebloc.io/workload=training could not.

Add a third egress rule scoped to podSelector {app: mysql-client} on TCP/3306. Same-namespace by default (no namespaceSelector), so it stays tight to the chart's own mysql pod and does not open the namespace generally. The egress[1] /32 ipBlock comment is updated to note that MySQL is now explicitly re-permitted by egress[2].

Verified on a k3d cluster: pre-fix, nc to mysql-client:3306 from a pod with the training label was refused; post-fix it connects.

* docs(migration-tools): tenant migration runbook for eks-1.0.x → client-1.x (#74)

* docs(migration-tools): tenant migration runbook for eks-1.0.x -> client-1.x

Captures the operational tooling validated during the 2026-04-27 stg and hasan-prod migrations and generalises it for the remaining tenants (bmw, cisco, charite) and any future tenant on the legacy chart family.

What's here:

- README.md walks the workflow + recommended ordering for the pending set + skip rationale for chart toggles (resourceMonitor: false, priorityClass.create: false, etc).
- generate.sh consumes a tenant-config.env (gitignored) and emits, per tenant, /tmp/tracebloc-migration-<tenant>/{values,storageclass,pvcs}.yaml. It refuses to expand placeholder __FOO__ rows, so an operator running generate.sh against the unmodified template fails fast.
- migrate-tenant.sh is the parameterised runbook. `phase1` is non-destructive (mysqldump-then-chunked-cp, AWS Backup on-demand recovery point, dry-run render). `phase2` is one-shot per tenant (helm uninstall, claimRef clear, SC re-create, PVC pre-create with release-scoped Helm ownership stamp, helm install, verify mysql data + keep annotation in stored manifest).
- tenant-config.example.env is the template; the populated copy is the secret-bearing artifact and must stay local.

No real secrets in any committed file:

- DOCKER_PASSWORD placeholder (__DOCKER_HUB_PERSONAL_ACCESS_TOKEN__)
- per-tenant CLIENT_ID / CLIENT_PASSWORD placeholders
- MYSQL_ROOT_PW placeholder (it's image-baked; required from env at runtime, no committed default)
- .gitignore now excludes docs/migration-tools/tenant-config.env (only the .example variant is tracked)

Operational notes:

- Every kubectl/helm call passes --context explicitly. The 2026-04-27 prod run hit a context-drift bug mid-migration; the explicit form is a hard requirement.
- values.yaml ships with resourceMonitor: false. Flip it to true after the release-scoped resource-monitor names land in client-1.2.0 (separate PR). Until then the shared SA in tracebloc-node-agents collides with the stg release.
- Phase 1 is idempotent and re-runnable. Phase 2 is destructive and one-shot per tenant. Operators should pause and eyeball Phase 1 outputs before running Phase 2 — that's deliberately not automated.

Once all four pending tenants are on client-1.x, this directory is historical. client-1.x -> client-1.y upgrades follow plain `helm upgrade` because the new chart already templates `helm.sh/resource-policy: keep` on PVCs, so the migration protocol isn't needed for routine upgrades.
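The fail-fast placeholder refusal described for generate.sh comes down to a per-field pattern check. A sketch under the assumption that placeholders always take the `__FOO__` shape named above; the function and field names are illustrative, not the script's actual code:

```shell
# Reject a config field that is empty or still carries a __PLACEHOLDER__
# value copied verbatim from tenant-config.example.env.
check_field() {
  local name="$1" value="$2"
  case "$value" in
    ''|*__*__*)
      echo "error: $name is empty or still a placeholder: '$value'" >&2
      return 1
      ;;
  esac
}

check_field CLIENT_ID "abc123" && echo "CLIENT_ID ok"
check_field DOCKER_PASSWORD "__DOCKER_HUB_PERSONAL_ACCESS_TOKEN__" || echo "DOCKER_PASSWORD rejected"
```

Naming the offending field in the error message (rather than a generic "placeholders found") is what turns a late `helm install` failure into an immediate, actionable one.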
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(migration-tools): address bugbot review feedback on PR #74

Three issues flagged by Cursor Bugbot on the migration scripts:

- migrate-tenant.sh used macOS-only `md5 -q` and `stat -f%z` for chunked-cp verification (HIGH). Linux operators would abort Phase 1 mid-transfer. Add portable `_md5` and `_size` helpers that pick md5sum on Linux, fall back to md5(1) on macOS, and use `wc -c` instead of stat for size.
- generate.sh's placeholder gate inspected only CLIENT_ID + CLIENT_PASSWORD + PV_MYSQL, missing PV_LOGS, PV_DATA, SC_NAME, and DOCKER_PASSWORD (MEDIUM). Literal `__FOO__` placeholders silently rendered into values.yaml/pvcs.yaml and only blew up at kubectl apply / helm install time. Iterate over every per-row field, plus a one-shot global check for DOCKER_PASSWORD before the loop. Error messages now name the offending field.
- Phase 2.5's readiness loop was an unbounded `while :; do … sleep 5; done` (MEDIUM). After the destructive helm uninstall, a non-converging install (image-pull error, mysql kill-loop recurrence, missing PVC binding) hung the script forever instead of surfacing the failure. Add a wall-clock deadline — default 600s, override via READY_TIMEOUT — and exit 1 with the last-seen pod state on timeout.

* fix(migration-tools): address bugbot follow-up on PR #74

Two more issues raised on the previous fix commit:

- The readiness wait loop aborted on an empty pod list (HIGH). With `set -euo pipefail`, the routine post-install window where no pods are visible yet caused `grep -c .` to exit 1, killing the script on the very first iteration before the wall-clock deadline could ever fire — defeating the bounded-wait intent. Guard the empty case explicitly. `wc -l` alone is also wrong because `echo ""` prints a newline.
- MYSQL_ROOT_PW skipped the placeholder check that DOCKER_PASSWORD, CLIENT_*, and PV_* now have (LOW). An operator who copied the example without editing this row passed the non-empty gate, then the literal __LEGACY_MYSQL_ROOT_PW__ went into mysqldump and Phase 1 blew up partway through with an opaque "Access denied" inside kubectl exec. Add the same `*__*__*` case guard right after the non-empty check.

* fix(migration-tools): make EFS_FS_OVERRIDE actually override (PR #74)

The pre-source assignment EFS_FS="${EFS_FS_OVERRIDE:-fs-06b3faf51675ff9f9}" was a no-op: `source "$CONFIG"` runs immediately after, and the example config (and any real tenant-config.env derived from it) unconditionally sets EFS_FS=fs-06b3faf51675ff9f9, so the env override was clobbered every time. Operators thinking they were targeting a non-default EFS would silently start AWS Backup on-demand jobs against the hard-coded prod filesystem. Move the override knob to AFTER source, where env genuinely wins, drop the hard-coded fallback, and require EFS_FS to be set somewhere (config or override) before continuing.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(client): release-scope SCC SA refs (v1.2.2) (#78)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix: NOTES.txt rename + generator chart-version drift (v1.2.3) — bugbot follow-up #2 (#80)

* fix(client): release-scope SCC SA refs (v1.2.2)

* fix: NOTES.txt rename + generator chart-version drift (v1.2.3)

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* docs(claude): require @saadqbal as PR assignee (#79)

Convention captured after a session-end ask. Every PR Claude opens for this repo must be assigned to saadqbal — orphaned PRs without an assignee fall through the review queue. Pass --assignee @me on `gh pr create` (or --assignee saadqbal if running unauthenticated). No exceptions.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: lukasWuttke <54042461+LukasWodka@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
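The bounded-wait and empty-pod-list fixes described above combine into roughly this shape. A sketch with a stubbed pod listing standing in for `kubectl get pods`, not the script's actual code; `READY_TIMEOUT` is the override knob named in the commit, everything else is illustrative:

```shell
set -euo pipefail

# Stub standing in for `kubectl get pods` output: empty for the first two
# polls, then a Running pod. Illustrative only.
list_pods() {
  [ "$1" -ge 3 ] && echo "mysql-client-0 1/1 Running"
  return 0
}

attempt=0
deadline=$(( $(date +%s) + ${READY_TIMEOUT:-600} ))
while :; do
  attempt=$((attempt + 1))
  pods="$(list_pods "$attempt")"
  # Empty-list guard: with pipefail, piping "" straight into `grep -c .`
  # exits 1 and would abort the script before the deadline could fire.
  if [ -n "$pods" ]; then
    ready=$(printf '%s\n' "$pods" | grep -c Running || true)
    if [ "$ready" -ge 1 ]; then
      echo "ready after $attempt polls"
      break
    fi
  fi
  # Wall-clock deadline: surface the last-seen pod state instead of hanging.
  if [ "$(date +%s)" -ge "$deadline" ]; then
    echo "timed out waiting for pods; last seen: '${pods:-<none>}'" >&2
    exit 1
  fi
  sleep 1
done
```

The `|| true` after `grep -c` matters under `set -e`: a zero-match count exits 1 even though "0" is a perfectly useful answer here.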
Summary
Second round of bugbot findings on the develop → main promotion (#77). Two real issues + three stale prose references caught in the same audit.
Stacked on #78 (v1.2.2 SCC fix). Once #78 lands the diff against develop reduces to just this commit. If you'd rather review them separately, I can rebase against develop after #78 merges.
Issues fixed
1. (Medium) NOTES.txt:9 showed the wrong DaemonSet name post-install

Pre-fix output operators saw after `helm install`:
```
```
The actual DaemonSet name has been `<release>-resource-monitor` since v1.2.0; the NOTES.txt template was missed in the rename.
Now routes through the same `tracebloc.resourceMonitorName` helper as the rest of the chart.
2. (Low) generate.sh hardcoded chart version 1.1.0

`docs/migration-tools/generate.sh` was written when the chart was 1.1.0 and stamped `app.kubernetes.io/version: "1.1.0"` + `helm.sh/chart: client-1.1.0` on every pre-create PVC. The chart has since moved through 1.2.0/1.2.1/1.2.2/1.2.3.
Helm adoption itself is unaffected (it keys on `meta.helm.sh/release-name`), but the labels lie until a subsequent upgrade reconciles them, and `kubectl get pvc -L helm.sh/chart` becomes misleading during migration debugging. Fixed by reading name + version from `client/Chart.yaml` at generate time.
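Reading the chart identity at generate time can be sketched as follows. The awk-based extraction and the temp-file stand-in for `client/Chart.yaml` are assumptions; the script may parse the fields differently:

```shell
# Derive PVC labels from the chart's own metadata instead of hardcoding them.
# Stand-in Chart.yaml; the real file lives at client/Chart.yaml.
chart_yaml=$(mktemp)
cat > "$chart_yaml" <<'EOF'
apiVersion: v2
name: client
version: 1.2.3
EOF

# `/^name:/` is anchored, so it does not also match `apiVersion:`.
chart_name=$(awk '/^name:/ {print $2}' "$chart_yaml")
chart_version=$(awk '/^version:/ {print $2}' "$chart_yaml")

echo "app.kubernetes.io/version: \"${chart_version}\""  # app.kubernetes.io/version: "1.2.3"
echo "helm.sh/chart: ${chart_name}-${chart_version}"    # helm.sh/chart: client-1.2.3
rm -f "$chart_yaml"
```

Because the labels are derived, a future chart bump is picked up automatically; the generator can never drift from the chart again.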
Stale prose references (caught while auditing — no functional impact)
What I ruled out (audit notes)
I went looking for additional stale references in the same vein and intentionally did not change:
Test plan
Related
Note: Medium Risk
Operationally impactful because it changes Helm post-install output and modifies the migration artifact generator used during tenant cutovers; failures would mostly affect operator workflows and label accuracy rather than runtime behavior.
Overview
Fixes a Helm NOTES regression by rendering the Resource Monitor DaemonSet name via `tracebloc.resourceMonitorName` (release-scoped) instead of a hardcoded `tracebloc-resource-monitor`, and adds a Helm unittest suite to prevent the literal name from reappearing and to verify that `resourceMonitor: false` suppresses the line.

Improves migration tooling correctness by having `docs/migration-tools/generate.sh` parse the chart `name`/`version` from `client/Chart.yaml` and stamp generated PVC manifests with matching `app.kubernetes.io/version` and `helm.sh/chart` labels; accompanying docs/runbook text is updated to reflect the current `client-1.2.0+` guidance.

Reviewed by Cursor Bugbot for commit 8ba1b2d.