Skip to content

Conversation

@arturshadnik
Copy link
Member

@arturshadnik arturshadnik commented Sep 8, 2025

Prior to this fix, it was possible for a race condition to occur where the handleAddonConfigs function would be executed before the AddOnTemplate CRD was created. This would cause an error which would be set as a negative condition on the FC, regardless of if any addons were actually configured. Once the CRD was available and the function returned a success, the negative condition would persist and not be removed, causing the FC to remain Unhealthy. This fix ensures that both positive and negative conditions are only set if there are addons being created or deleted.

Summary by CodeRabbit

  • New Features

    • Status now reflects whether add-on or hub add-on operations actually changed system state.
  • Bug Fixes

    • Prevents falsely marking add-ons as configured when no changes occurred.
    • Improves error reporting during add-on install/uninstall to show true failure states.
  • Refactor

    • Add-on workflows now use explicit change detection to reduce unnecessary updates and improve state reporting.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
@coderabbitai
Copy link

coderabbitai bot commented Sep 8, 2025

Walkthrough

handleAddonConfig and handleHubAddons now return (bool, error) to signal whether reconciliation changed state; hub.go captures these flags, adjusts error handling and FleetConfig condition setting based on change signals, and updates InstalledHubAddOns/status accordingly.

Changes

Cohort / File(s) Summary
Addon handlers — signature and control flow
fleetconfig-controller/internal/controller/v1alpha1/addon.go
Refactors handleAddonConfig and handleHubAddons to return (bool, error) indicating whether reconciliation changed state; propagates change flags through list/create/delete/install/uninstall paths and updates return values to signal changes on success or failure.
Call sites and condition handling
fleetconfig-controller/internal/controller/v1alpha1/hub.go
Captures addonConfigChanged and hubAddonChanged flags, changes error handling to consider whether a change occurred before marking negative conditions, and sets FleetConfigAddonsConfigured based on change detection; retains hub upgrade flow and updates status assignment of installed hub addons.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • ahmad-ibra
  • TylerGillson

Pre-merge checks (2 passed, 1 warning)

❌ Failed Checks (1 warning)
Check Name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed Checks (2 passed)
Check Name Status Explanation
Title Check ✅ Passed The title succinctly describes the primary bug fix by preventing negative addon conditions when no addon operations occur, directly matching the changeset.
Description Check ✅ Passed The description clearly explains the root cause of the race condition and the rationale for only setting conditions when addon operations occur, giving reviewers sufficient context for the changes.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
		  - name: "Undocumented Breaking Changes"
			  mode: "warning"
			  instructions: |
				  Flag potential breaking changes that are not documented:
				  1. Identify changes to public APIs/exports, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints (including removed/renamed items and changes to types, required params, return values, defaults, or behavior).
				  2. Ignore purely internal/private changes (e.g., code not exported from package entry points or marked internal).
				  3. Verify documentation exists: a "Breaking Change" section in the PR description and updates to CHANGELOG.md.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f15c29 and 7af8b83.

📒 Files selected for processing (1)
  • fleetconfig-controller/internal/controller/v1alpha1/addon.go (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • fleetconfig-controller/internal/controller/v1alpha1/addon.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: e2e (fleetconfig-controller) / e2e
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added the approved label Sep 8, 2025
Copy link
Collaborator

@TylerGillson TylerGillson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will work, but can we also log a useful warning message with level 1 if the crds are not found?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
fleetconfig-controller/internal/controller/v1alpha1/addon.go (2)

186-205: Fix nil-pointer deref when AddOnTemplate is NotFound

If Get() returns NotFound, addon is nil, but code later dereferences addon.Spec.AddonName. This will panic.

 		addon, err := addonC.AddonV1alpha1().AddOnTemplates().Get(ctx, addonName, metav1.GetOptions{})
 		if err != nil && !kerrs.IsNotFound(err) {
 			errs = append(errs, fmt.Errorf("failed to delete addon %s: %v", addonName, err))
 			continue
 		}
 
 		// delete the addon template
 		if addon != nil {
 			err = addonC.AddonV1alpha1().AddOnTemplates().Delete(ctx, addonName, metav1.DeleteOptions{})
 			if err != nil && !kerrs.IsNotFound(err) {
 				errs = append(errs, fmt.Errorf("failed to delete addon %s: %v", addonName, err))
 				continue
 			}
 		}
 
-		baseAddonName := addon.Spec.AddonName
+		if addon == nil {
+			// Template already gone; base AddOn name unknown. Skip purge evaluation for this entry.
+			logger.V(5).Info("addon template not found; skipping purge evaluation", "AddOnTemplate", addonName)
+			continue
+		}
+		baseAddonName := addon.Spec.AddonName

If you prefer to still attempt purge without the template, we can derive the base name from our naming convention, but that’s lossy; above is safer.


411-469: Return “changed” only when install/uninstall ops are needed

Same issue as handleAddonConfig: this currently returns changed=true even on no-ops. Compute changed from the diff and propagate it in error paths and the final return.

 	// do uninstalls first, then installs
-	err := handleHubAddonUninstall(ctx, addonsToUninstall, fc)
-	if err != nil {
-		return true, err
-	}
-
-	err = handleHubAddonInstall(ctx, addonC, addonsToInstall, bundleVersion, fc)
-	if err != nil {
-		return true, err
-	}
+	changed := len(addonsToUninstall) > 0 || len(addonsToInstall) > 0
+	if err := handleHubAddonUninstall(ctx, addonsToUninstall, fc); err != nil {
+		return changed, err
+	}
+	if err := handleHubAddonInstall(ctx, addonC, addonsToInstall, bundleVersion, fc); err != nil {
+		return changed, err
+	}
 
 	// build the new installed addons list
 	// ...
 	fc.Status.InstalledHubAddOns = newInstalledAddOns
-	return true, nil
+	return changed, nil
🧹 Nitpick comments (2)
fleetconfig-controller/internal/controller/v1alpha1/hub.go (2)

89-96: Log suppressed errors when no addon ops were planned

When err != nil and addonConfigChanged==false, we intentionally don’t surface a negative condition. Add a low-verbosity log to aid triage.

-	addonConfigChanged, err := handleAddonConfig(ctx, kClient, addonC, fc)
+	addonConfigChanged, err := handleAddonConfig(ctx, kClient, addonC, fc)
 	if err != nil && addonConfigChanged {
 		fc.SetConditions(true, v1alpha1.NewCondition(
 			err.Error(), v1alpha1.FleetConfigAddonsConfigured, metav1.ConditionFalse, metav1.ConditionTrue,
 		))
 		return err
 	}
+	if err != nil && !addonConfigChanged {
+		log.FromContext(ctx).V(5).Info("ignoring addon-config error (no addon ops planned)", "error", err)
+	}

97-104: Same: log suppressed hub-addon errors when no ops were planned

-	hubAddonChanged, err := handleHubAddons(ctx, addonC, fc)
+	hubAddonChanged, err := handleHubAddons(ctx, addonC, fc)
 	if err != nil && hubAddonChanged {
 		fc.SetConditions(true, v1alpha1.NewCondition(
 			err.Error(), v1alpha1.FleetConfigAddonsConfigured, metav1.ConditionFalse, metav1.ConditionTrue,
 		))
 		return err
 	}
+	if err != nil && !hubAddonChanged {
+		log.FromContext(ctx).V(5).Info("ignoring hub-addon error (no addon ops planned)", "error", err)
+	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c8efab6 and 6f15c29.

📒 Files selected for processing (2)
  • fleetconfig-controller/internal/controller/v1alpha1/addon.go (6 hunks)
  • fleetconfig-controller/internal/controller/v1alpha1/hub.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
fleetconfig-controller/internal/controller/v1alpha1/hub.go (2)
fleetconfig-controller/api/v1alpha1/fleetconfig_types.go (1)
  • NewCondition (146-157)
fleetconfig-controller/api/v1alpha1/constants.go (1)
  • FleetConfigAddonsConfigured (16-16)
fleetconfig-controller/internal/controller/v1alpha1/addon.go (3)
fleetconfig-controller/vendor/open-cluster-management.io/api/client/addon/clientset/versioned/clientset.go (1)
  • Clientset (21-24)
fleetconfig-controller/api/v1alpha1/fleetconfig_types.go (1)
  • FleetConfig (732-738)
fleetconfig-controller/api/v1alpha1/constants.go (1)
  • ManagedBySelector (98-98)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: e2e (fleetconfig-controller) / e2e
🔇 Additional comments (2)
fleetconfig-controller/internal/controller/v1alpha1/addon.go (1)

42-47: Good: no negative condition when CRD list fails and no addons are configured

Returning (false, err) when requestedAddOns is empty ensures we don’t publish a negative condition for a no-op scenario (race with CRD creation). Looks aligned with the PR objective.

fleetconfig-controller/internal/controller/v1alpha1/hub.go (1)

105-111: Condition-setting relies on correct changed flags—verify after handler fixes

Once the handlers return accurate changed flags (see addon.go comments), this block will align with the PR objective. Without those fixes, this may still set success on no-ops.

Would you like me to open a follow-up PR after we merge the handler fixes to add an integration test ensuring we set AddonsConfigured only when actual create/delete/install/uninstall occurs?

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
@openshift-ci
Copy link

openshift-ci bot commented Sep 8, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arturshadnik, TylerGillson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 7b2a16d into open-cluster-management-io:main Sep 8, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants