Feature/ha service self enable and disable #308

Merged
PaulPickhardt merged 3 commits into main from feature/ha-service-self-enable-and-disable on May 29, 2026

@PaulPickhardt PaulPickhardt commented May 20, 2026

Summary by CodeRabbit

Release Notes

  • Refactor
    • Removed HA (High Availability) enablement management from the hypervisor controller. HA state management is now delegated to a dedicated service, improving system architecture and providing cleaner separation of responsibilities across components.

coderabbitai Bot commented May 20, 2026

Warning

Review limit reached

@PaulPickhardt, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 42 minutes and 8 seconds. Learn how PR review limits work.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 653a9b2c-60b1-415d-aef3-944f40836623

📥 Commits

Reviewing files that changed from the base of the PR and between 74d7082 and 5ffc9ef.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
  • cmd/main.go
  • go.mod
  • internal/controller/hypervisor_instance_ha_controller.go
  • internal/controller/hypervisor_instance_ha_controller_test.go
  • internal/controller/hypervisor_maintenance_controller.go
  • internal/controller/hypervisor_maintenance_controller_test.go
  • internal/controller/utils.go
📝 Walkthrough

Walkthrough

The PR removes the internal HypervisorInstanceHaController from the controller manager wiring and deletes its HTTP-based HA utility functions. The HypervisorMaintenanceController is updated to no longer handle the MaintenanceHA mode, delegating HA management to an external service. Test expectations are updated to reflect these changes.

Changes

Externalize HA service management

| Layer / File(s) | Summary |
| --- | --- |
| Remove internal HA controller and utilities: cmd/main.go, internal/controller/utils.go | Removes HypervisorInstanceHaController wiring from the controller manager setup and deletes the InstanceHaUrl function and related HTTP POST/enable/disable HA helpers, reducing imports to only remaining node-condition utilities. |
| Update maintenance controller for HA delegation: internal/controller/hypervisor_maintenance_controller.go, internal/controller/hypervisor_maintenance_controller_test.go | MaintenanceUnset now explicitly clears forced_down on compute service updates, and the disable switch is narrowed to exclude MaintenanceHA mode. Tests are updated with dedicated HA context asserting no service disable action and revised Nova API expectations for the empty maintenance case. |
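
The narrowed switch described above could look roughly like the following sketch. It is not the operator's actual code: only the unset ("") and "ha" modes are named in this PR, so MaintenanceManual, MaintenanceAuto, the Action type, and computeServiceAction are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// Maintenance models the spec.maintenance value. Only "" (unset) and "ha"
// appear in this PR; the other constants are hypothetical placeholders.
type Maintenance string

const (
	MaintenanceUnset  Maintenance = ""
	MaintenanceHA     Maintenance = "ha"
	MaintenanceManual Maintenance = "manual" // hypothetical
	MaintenanceAuto   Maintenance = "auto"   // hypothetical
)

// Action is what the maintenance controller asks of the compute service.
type Action string

const (
	ActionEnable  Action = "enable and clear forced_down"
	ActionDisable Action = "disable"
	ActionNone    Action = "none (left to the HA service)"
)

// computeServiceAction decides what to do for a given maintenance mode.
// MaintenanceHA is deliberately absent from the disable case, so the
// controller takes no action and leaves that mode to the external service.
func computeServiceAction(m Maintenance) Action {
	switch m {
	case MaintenanceUnset:
		return ActionEnable
	case MaintenanceManual, MaintenanceAuto:
		return ActionDisable
	default:
		return ActionNone
	}
}

func main() {
	modes := []Maintenance{MaintenanceUnset, MaintenanceHA, MaintenanceManual, MaintenanceAuto}
	for _, m := range modes {
		fmt.Printf("maintenance=%q -> %s\n", string(m), computeServiceAction(m))
	}
}
```

Listing the disable modes explicitly, rather than disabling in a default branch, keeps any mode owned by another component from being disabled by accident.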

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • mchristianl
  • fwiesel

Poem

🐰 The HA controller hops away,
Its duties passed to service gray,
While maintenance clears the forced_down way,
The operator's role grows light today!
External services lead the way

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title describes a feature being added, but the changes actually remove the HypervisorInstanceHaController and delegate HA management to a separate service, which is a refactoring/architectural change, not a straightforward feature addition. | Clarify the title to better reflect that this refactors HA management by removing the controller and delegating to kvm-ha-service, e.g., 'Delegate HA management to kvm-ha-service' or 'Refactor: Remove HypervisorInstanceHaController'. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@PaulPickhardt PaulPickhardt force-pushed the feature/ha-service-self-enable-and-disable branch from 7b2f4f8 to 1663db8 Compare May 21, 2026 08:17
@PaulPickhardt PaulPickhardt marked this pull request as ready for review May 22, 2026 10:09

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/controller/hypervisor_maintenance_controller.go (1)

117-135: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Line 123 early-return can skip clearing forced_down in MaintenanceUnset.

If ConditionTypeHypervisorDisabled is already false, reconciliation returns before the Nova update call, so forced_down=false may never be sent.

Proposed fix
-		if !meta.SetStatusCondition(&hv.Status.Conditions, metav1.Condition{
+		_ = meta.SetStatusCondition(&hv.Status.Conditions, metav1.Condition{
 			Type:    kvmv1.ConditionTypeHypervisorDisabled,
 			Status:  metav1.ConditionFalse,
 			Message: "Hypervisor is enabled",
 			Reason:  kvmv1.ConditionReasonSucceeded,
-		}) {
-			// Spec matches status
-			return nil
-		}
+		})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/hypervisor_maintenance_controller.go` around lines 117 -
135, The current early-return when SetStatusCondition(&hv.Status.Conditions,
metav1.Condition{Type: kvmv1.ConditionTypeHypervisorDisabled, Status:
metav1.ConditionFalse, ...}) returns false causes the Nova Update
(services.Update using enableService) to be skipped and thus forced_down may
never be cleared; in MaintenanceUnset, remove or change the early return so the
enableService update (services.Update(ctx, hec.computeClient, serviceId,
enableService).Extract()) always runs even if the status condition was already
false — i.e., ensure the call that sets Status: services.ServiceEnabled and
ForcedDown: &falseVal executes unconditionally (or move it below the condition
block) while still avoiding duplicate status writes to hv.Status.Conditions.
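
To make the finding concrete, here is a stdlib-only sketch of the two control flows. The condition type, setCondition, and fakeNova are simplified stand-ins written for this example; they are not the real metav1, meta, or Nova client types. setCondition only mirrors the "returns true if something changed" contract of meta.SetStatusCondition and compares the status alone.

```go
package main

import "fmt"

// condition is a simplified stand-in for metav1.Condition.
type condition struct {
	Type   string
	Status string
}

// setCondition returns true only when the stored condition changed,
// mirroring the return contract of meta.SetStatusCondition.
func setCondition(conds *[]condition, c condition) bool {
	for i := range *conds {
		if (*conds)[i].Type == c.Type {
			if (*conds)[i].Status == c.Status {
				return false
			}
			(*conds)[i].Status = c.Status
			return true
		}
	}
	*conds = append(*conds, c)
	return true
}

// fakeNova stands in for the Nova compute service and counts update calls.
type fakeNova struct {
	forcedDown bool
	updates    int
}

// enable models an update that enables the service and clears forced_down.
func (n *fakeNova) enable() {
	n.forcedDown = false
	n.updates++
}

// reconcileUnsetEarlyReturn models the flagged flow: when the condition is
// already False, it returns before Nova is ever called.
func reconcileUnsetEarlyReturn(conds *[]condition, nova *fakeNova) {
	if !setCondition(conds, condition{Type: "HypervisorDisabled", Status: "False"}) {
		return
	}
	nova.enable()
}

// reconcileUnsetAlways models the proposed fix: the condition is still set,
// but the Nova update runs unconditionally.
func reconcileUnsetAlways(conds *[]condition, nova *fakeNova) {
	_ = setCondition(conds, condition{Type: "HypervisorDisabled", Status: "False"})
	nova.enable()
}

func main() {
	// The condition already says "not disabled", yet Nova still has
	// forced_down=true (for example, set by another component).
	conds := []condition{{Type: "HypervisorDisabled", Status: "False"}}

	stale := &fakeNova{forcedDown: true}
	reconcileUnsetEarlyReturn(&conds, stale)
	fmt.Println("early return: forced_down =", stale.forcedDown)

	fixed := &fakeNova{forcedDown: true}
	reconcileUnsetAlways(&conds, fixed)
	fmt.Println("always update: forced_down =", fixed.forcedDown)
}
```

The trade-off of the unconditional variant is one extra Nova call per reconcile in the unset case, which may or may not be acceptable depending on how often the controller reconciles.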

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8f6e7348-4b8c-4ef3-a17e-4547b9c8d9b1

📥 Commits

Reviewing files that changed from the base of the PR and between 65bc78f and 1663db8.

📒 Files selected for processing (7)
  • charts/openstack-hypervisor-operator/templates/ha-service-rbac.yaml
  • cmd/main.go
  • internal/controller/hypervisor_instance_ha_controller.go
  • internal/controller/hypervisor_instance_ha_controller_test.go
  • internal/controller/hypervisor_maintenance_controller.go
  • internal/controller/hypervisor_maintenance_controller_test.go
  • internal/controller/utils.go
💤 Files with no reviewable changes (4)
  • internal/controller/hypervisor_instance_ha_controller_test.go
  • cmd/main.go
  • internal/controller/hypervisor_instance_ha_controller.go
  • internal/controller/utils.go

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ include "openstack-hypervisor-operator.fullname" . }}-ha-service-role

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Quote the templated ClusterRole name to avoid YAML parse/lint failure.

Line 4 is parsed as invalid YAML by linters before Helm rendering.

Proposed fix
-  name: {{ include "openstack-hypervisor-operator.fullname" . }}-ha-service-role
+  name: '{{ include "openstack-hypervisor-operator.fullname" . }}-ha-service-role'
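
As a rough illustration of why the quoting is safe, the sketch below renders both variants with Go's text/template, the engine Helm templates build on. The include function here is a hypothetical stub that returns a fixed fullname; it is not Helm's implementation, and renderName, unquotedLine, and quotedLine are names made up for this example. The sketch only shows the rendered text; it does not lint or parse YAML.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

const (
	unquotedLine = `  name: {{ include "openstack-hypervisor-operator.fullname" . }}-ha-service-role`
	quotedLine   = `  name: '{{ include "openstack-hypervisor-operator.fullname" . }}-ha-service-role'`
)

// renderName renders one manifest line. The "include" function is a stub
// (an assumption for this sketch) that resolves a single known template name.
func renderName(line, fullname string) (string, error) {
	funcs := template.FuncMap{
		"include": func(name string, data any) (string, error) {
			if name != "openstack-hypervisor-operator.fullname" {
				return "", fmt.Errorf("unknown template %q", name)
			}
			return fullname, nil
		},
	}
	tmpl, err := template.New("line").Funcs(funcs).Parse(line)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, map[string]any{}); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	for _, line := range []string{unquotedLine, quotedLine} {
		rendered, err := renderName(line, "demo")
		if err != nil {
			fmt.Println("render error:", err)
			return
		}
		fmt.Println(rendered)
	}
}
```

Both variants render to the same resource name; the quoted one simply carries YAML single quotes, which is what lets a linter read the unrendered file as a plain string.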
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-  name: {{ include "openstack-hypervisor-operator.fullname" . }}-ha-service-role
+  name: '{{ include "openstack-hypervisor-operator.fullname" . }}-ha-service-role'
🧰 Tools
🪛 YAMLlint (1.38.0)

[error] 4-4: syntax error: expected <block end>, but found '<scalar>'

(syntax)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@charts/openstack-hypervisor-operator/templates/ha-service-rbac.yaml` at line
4, The ClusterRole name using the Helm template include
"openstack-hypervisor-operator.fullname" (the value that produces "<include
...>-ha-service-role") must be quoted to avoid YAML parse/lint errors; update
the metadata name in ha-service-rbac.yaml (the resource that sets name: {{
include "openstack-hypervisor-operator.fullname" . }}-ha-service-role) to wrap
the entire templated expression in quotes so linters see a valid YAML string.

@notandy notandy left a comment

rest is fine!

Comment thread charts/openstack-hypervisor-operator/templates/ha-service-rbac.yaml Outdated
@PaulPickhardt PaulPickhardt force-pushed the feature/ha-service-self-enable-and-disable branch 2 times, most recently from 74d7082 to 84153a4 Compare May 27, 2026 13:05
  The kvm-ha-service now owns the HA lifecycle for hypervisors. It sets
  spec.maintenance=ha and the ConditionTypeHaEnabled condition, and it
  sets the ConditionTypeHypervisorDisabled status condition after the
  compute service has been disabled.

  Changes:
  - Delete HypervisorInstanceHaController and its tests
  - Remove enableInstanceHA, disableInstanceHA, updateInstanceHA, and
    InstanceHaUrl from utils.go
  - Exclude MaintenanceHA from the maintenance controller's compute
    service disable logic, as the ha-service handles that case

  When maintenance: ha is removed, the kvm-ha-service will have set
  forced_down=true on the nova-compute service. The operator's
  MaintenanceUnset path previously only re-enabled the service status
  but did not clear forced_down, leaving the compute service unable to
  accept workloads despite being marked enabled.
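
One detail worth spelling out for the forced_down fix: with Go's encoding/json, a false value tagged omitempty is dropped from the request body unless the field is a pointer, which may be why the review comment refers to ForcedDown: &falseVal. The two structs below are hypothetical request bodies written for this sketch; whether the client library's real options type looks like either of them is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// updateWithBool is a hypothetical request body using a plain bool.
// With omitempty, a false value is omitted from the JSON entirely.
type updateWithBool struct {
	Status     string `json:"status,omitempty"`
	ForcedDown bool   `json:"forced_down,omitempty"`
}

// updateWithPointer is a hypothetical request body using *bool.
// Only a nil pointer is omitted, so an explicit false is sent.
type updateWithPointer struct {
	Status     string `json:"status,omitempty"`
	ForcedDown *bool  `json:"forced_down,omitempty"`
}

// mustJSON marshals v and panics on error; fine for a small sketch.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	falseVal := false
	fmt.Println(mustJSON(updateWithBool{Status: "enabled", ForcedDown: false}))
	// prints {"status":"enabled"}
	fmt.Println(mustJSON(updateWithPointer{Status: "enabled", ForcedDown: &falseVal}))
	// prints {"status":"enabled","forced_down":false}
}
```

With the plain bool, the service would be re-enabled while forced_down stays at whatever value was set earlier, which matches the symptom described in the commit message.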
@PaulPickhardt PaulPickhardt force-pushed the feature/ha-service-self-enable-and-disable branch from 84153a4 to 86e5849 Compare May 27, 2026 13:07
@github-actions

Merging this branch will decrease overall coverage

| Impacted Packages | Coverage Δ | 🤖 |
| --- | --- | --- |
| github.com/cobaltcore-dev/openstack-hypervisor-operator/cmd | 0.00% (ø) | |
| github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller | 68.98% (-1.19%) | 👎 |

Coverage by file

Changed files (no unit tests)

| Changed File | Coverage Δ | Total | Covered | Missed | 🤖 |
| --- | --- | --- | --- | --- | --- |
| github.com/cobaltcore-dev/openstack-hypervisor-operator/cmd/main.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller/hypervisor_instance_ha_controller.go | 0.00% (-90.91%) | 0 (-44) | 0 (-40) | 0 (-4) | 💀 💀 💀 💀 💀 |
| github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller/hypervisor_maintenance_controller.go | 79.31% (+0.24%) | 87 (+1) | 69 (+1) | 18 | 👍 |
| github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller/utils.go | 100.00% (+15.38%) | 13 (-26) | 13 (-20) | 0 (-6) | 🎉 |

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller/hypervisor_instance_ha_controller_test.go
  • github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller/hypervisor_maintenance_controller_test.go

@PaulPickhardt PaulPickhardt requested a review from notandy May 27, 2026 13:17
@PaulPickhardt PaulPickhardt merged commit 2390e7a into main May 29, 2026
7 checks passed
@PaulPickhardt PaulPickhardt deleted the feature/ha-service-self-enable-and-disable branch May 29, 2026 11:33

4 participants