
(feat) Guarantee helm chart handoff when cluster switches between profiles #1780

Merged
gianlucam76 merged 1 commit into projectsveltos:main from gianlucam76:handoff
May 11, 2026
@gianlucam76
Member

@gianlucam76 gianlucam76 commented May 10, 2026

When a cluster atomically switches from ClusterProfile A to ClusterProfile B (both wanting the same Helm chart, e.g. Kyverno), a race condition could cause a delete+reinstall instead of an in-place upgrade:

  1. ClusterProfile A detects the cluster is no longer a match and starts undeploying
  2. ClusterProfile B has not yet run its reconciler and registered its charts with the chart manager
  3. canUninstallHelmChart sees no other registrant → proceeds with uninstall
  4. B eventually installs → unnecessary downtime
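
The four steps above can be reproduced with a toy registry. This is only a sketch of the ordering problem: `chartRegistry`, `register`, and `naiveCanUninstall` are hypothetical names made up for illustration and are not the Sveltos chart manager API.

```go
package main

import "sync"

// chartRegistry is a toy stand-in for a chart manager (hypothetical type).
// It records which profiles have registered a given Helm release.
type chartRegistry struct {
	mu          sync.Mutex
	registrants map[string]map[string]bool // release name -> set of profile names
}

func newChartRegistry() *chartRegistry {
	return &chartRegistry{registrants: map[string]map[string]bool{}}
}

// register records that profile wants to manage release.
func (r *chartRegistry) register(profile, release string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.registrants[release] == nil {
		r.registrants[release] = map[string]bool{}
	}
	r.registrants[release][profile] = true
}

// naiveCanUninstall models the decision without the handoff guard: the
// uninstall is allowed as soon as no profile other than the caller has
// registered the release, even if another profile is about to register.
func (r *chartRegistry) naiveCanUninstall(profile, release string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	for other := range r.registrants[release] {
		if other != profile {
			return false
		}
	}
	return true
}
```

If profile A calls `naiveCanUninstall` before profile B has called `register`, the result is `true` and the release is removed. The same call made after B has registered returns `false`. The outcome depends only on which reconciler runs first.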

Before allowing a Helm chart uninstall, verify that every other ClusterProfile/Profile currently matching the cluster has had its ClusterSummary fully processed by the chart manager. "Fully processed" means GetRegisteredChartsCount == len(spec.HelmCharts), which proves B's reconciler has run and registered all its charts.
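
A minimal sketch of the "fully processed" test, under stated assumptions: `profileState` and `pendingProfiles` are hypothetical, and the two integer fields stand in for `len(spec.HelmCharts)` and the value `GetRegisteredChartsCount` would report. The real code reads these from ClusterSummary objects and the chart manager.

```go
package main

// profileState summarizes one profile that currently matches the cluster
// (hypothetical type, for illustration only).
type profileState struct {
	Name             string
	DesiredCharts    int // stands in for len(spec.HelmCharts)
	RegisteredCharts int // stands in for the chart manager's registered count
}

// fullyProcessed reports whether the profile has registered every chart it declares.
func fullyProcessed(p profileState) bool {
	return p.RegisteredCharts == p.DesiredCharts
}

// pendingProfiles returns the names of matching profiles, other than self,
// that have not registered all of their charts yet. An empty result means
// the uninstall decision can rely on the registry contents.
func pendingProfiles(self string, matching []profileState) []string {
	var pending []string
	for _, p := range matching {
		if p.Name == self {
			continue
		}
		if !fullyProcessed(p) {
			pending = append(pending, p.Name)
		}
	}
	return pending
}
```

In this sketch a profile that declares no Helm charts satisfies the equality trivially (0 == 0), so it never blocks an uninstall.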

If any matching profile has not been processed yet, the check returns a new WaitForProfileProcessingError sentinel, which makes the controller requeue after a short delay (10s) instead of logging an error. Once B has registered, the existing len(otherRegistered) > 1 check becomes authoritative and the handoff proceeds as an upgrade.
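
The sentinel-and-requeue pattern could look roughly like the sketch below. The names `errWaitForProfileProcessing`, `undeployChart`, and `requeueAfter` are hypothetical and do not reflect the actual Sveltos types. The 10s delay is taken from the description above, while the one-minute fallback for other errors is an arbitrary assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errWaitForProfileProcessing is a hypothetical sentinel meaning "not a
// failure, try again once the other profiles have registered their charts".
var errWaitForProfileProcessing = errors.New("waiting for other profiles to be processed")

// handoffRequeueDelay is the short delay (10s) mentioned in the description.
const handoffRequeueDelay = 10 * time.Second

// undeployChart refuses to uninstall while other matching profiles are
// still pending, wrapping the sentinel so callers can detect it.
func undeployChart(pending int) error {
	if pending > 0 {
		return fmt.Errorf("%d profile(s) not processed yet: %w", pending, errWaitForProfileProcessing)
	}
	return nil // safe to consult the registry and proceed
}

// requeueAfter maps an error to a requeue delay. The second result reports
// whether the error should be logged as a failure.
func requeueAfter(err error) (time.Duration, bool) {
	switch {
	case err == nil:
		return 0, false
	case errors.Is(err, errWaitForProfileProcessing):
		return handoffRequeueDelay, false // expected wait, not an error
	default:
		return time.Minute, true // assumption: arbitrary delay for real failures
	}
}
```

Wrapping with `%w` keeps the sentinel detectable through `errors.Is` even after context is added to the message, so the reconciler can tell an expected wait apart from a genuine failure.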

Fixes #1779

@gianlucam76 gianlucam76 force-pushed the handoff branch 6 times, most recently from 2f2db99 to e6d0949 Compare May 11, 2026 11:45
@gianlucam76 gianlucam76 merged commit 9309590 into projectsveltos:main May 11, 2026
15 of 16 checks passed
@gianlucam76 gianlucam76 deleted the handoff branch May 11, 2026 19:17
Development

Successfully merging this pull request may close these issues.

Guarantee upgrade semantics when switching between profiles atomically