Skip to content

Conversation

@arturshadnik
Copy link
Member

@arturshadnik arturshadnik commented Nov 19, 2025

  • Adds deploymentConfig field to Spoke.Spec
  • Deletes Spoke.Spec.AddOn.InstallNamespace field, replaced by AddOnDeploymentConfigSpec.AgentInstallNamespace
  • For each AddOn where a deploymentConfig is provided, adds the --config-file flag to clusteradm addon enable
  • Implements hash based config revision tracking to ensure updates are correctly propagated
  • Updates e2e tests to verify updates to AddOn variables
  • Adds a docs page explaining how to use the AddOn and AddOnConfig APIs

Summary by CodeRabbit

  • New Features

    • Rich DeploymentConfig for addons (customized variables, placement, proxy, image registries, resource constraints) and automatic detection of config changes to trigger redeploys.
  • Documentation

    • Added comprehensive addon framework guide describing configuration and workflows.
  • Chores

    • Bumped clusteradm installer version.
  • Tests

    • Updated end-to-end tests to verify addon variable resolution and redeploy behavior.

✏️ Tip: You can customize this high-level summary in your review settings.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
@coderabbitai
Copy link

coderabbitai bot commented Nov 19, 2025

Walkthrough

Adds addon deployment configuration support: replaces AddOn.InstallNamespace with DeploymentConfig, introduces a deployment-config hash annotation, refactors addon enable/disable to use a temporary deployment-config file passed to clusteradm, updates CRD/schema, generated deepcopy, Dockerfile versions, docs, and e2e tests.

Changes

Cohort / File(s) Summary
API constants
fleetconfig-controller/api/v1beta1/constants.go
Adds exported constant AnnotationAddOnDeploymentConfigHash = "fleetconfig.open-cluster-management.io/configHash"
API types
fleetconfig-controller/api/v1beta1/spoke_types.go
Replaces AddOn.InstallNamespace with AddOn.DeploymentConfig *addonv1alpha1.AddOnDeploymentConfigSpec; adds import addonv1alpha1
Generated deepcopy
fleetconfig-controller/api/v1beta1/zz_generated.deepcopy.go
Adds import for addon v1alpha1 and deep-copy logic for DeploymentConfig with nil check
Build files
fleetconfig-controller/build/Dockerfile.base, fleetconfig-controller/build/Dockerfile.devspace, fleetconfig-controller/build/Dockerfile.eks, fleetconfig-controller/build/Dockerfile.gke, fleetconfig-controller/hack/.versions.env
Bumps CLUSTERADM_VERSION from 1.0.2 to 1.1.0 across Dockerfiles and versions env
CRD / Schema
fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
Replaces installNamespace string with deploymentConfig object and adds nested fields: agentInstallNamespace, customizedVariables, nodePlacement, proxyConfig, registries, resourceRequirements and validation/metadata
Controller logic
fleetconfig-controller/internal/controller/v1beta1/addon.go
Adds createAddonDeploymentConfigFile(); computes deployment-config hash annotation; refactors enable/disable flow to use config-file + hash; removes patchFCCMca() and related in-place MCA patching; updates function signatures/usages
Spoke handler
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
Adds configureFCCAddOn() to merge FCC addon DeploymentConfig with hub/spoke vars, sorts customized variables, uses deepcopy of spoke for FCC config, and updates AddonsConfigured signaling
Documentation
fleetconfig-controller/docs/addon-framework.md
Adds comprehensive addon framework documentation (custom addons, built-in hub addons, workflows, command construction, best practices, troubleshooting)
Test fixtures
fleetconfig-controller/test/data/addon-2-cm.yaml, fleetconfig-controller/test/data/fleetconfig-values.yaml
Updates test addon namespace (test-addon), adds ConfigMap with addon variables and adds deploymentConfig entries to spoke addon test values
E2E helpers & tests
fleetconfig-controller/test/e2e/helper.go, fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go
Adds helper constants and ensureAddonVariablesResolved(); updates tests to verify initial addon variable resolution and addon template/version update flows; imports addonv1alpha1 in tests

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Areas needing extra attention:
    • fleetconfig-controller/internal/controller/v1beta1/addon.go — hashing, temp-file creation, clusteradm invocation, and enable/disable semantics.
    • CRD schema changes and spoke_types.go — API surface change and validation semantics for DeploymentConfig.
    • E2E tests — ensure test updates align with new deploymentConfig behavior and variable resolution.

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • ahmad-ibra
  • TylerGillson

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title "expose addon configuration in Spoke API" accurately reflects the main objective of adding a deploymentConfig field to the Spoke API and replacing the simple InstallNamespace with comprehensive AddOnDeploymentConfigSpec, which is the core change throughout the changeset.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Copy link

openshift-ci bot commented Nov 19, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arturshadnik

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@arturshadnik arturshadnik marked this pull request as ready for review November 19, 2025 20:48
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1)

359-363: Double‑check AddonsConfigured condition semantics across enable/cleanup paths

You now set AddonsConfigured to True:

  • After successfully enabling addons in doHubWork.
  • After disabling addons in hubCleanupPreflight (“Success – addons disabled/cleaned up”).
  • After deleting the FCC agent addon in waitForAgentAddonDeleted.

Later, after waitForAddonManifestWorksCleanup, you flip it to False to signal full cleanup.

If consumers interpret status.conditions[AddonsConfigured].status strictly as “addons are currently configured/enabled”, the intermediate True states during teardown may be confusing. If instead it’s meant as a more generic “addon operations succeeded” flag, this is fine, but it may be worth:

  • Using more specific reasons/messages for the cleanup paths, or
  • Documenting that False specifically means “all addon manifestWorks cleaned up; no enabled addons remain”.

Not a blocker, but clarifying this now will avoid ambiguity for future readers or external controllers.

Also applies to: 670-673, 717-720

fleetconfig-controller/internal/controller/v1beta1/addon.go (1)

398-464: Good partial success handling for addon enablement!

The function correctly tracks which addons were successfully enabled even when others fail:

  • Collects errors in enableErrs slice (lines 398, 438, 452)
  • Only adds to enabledAddons on success (line 456)
  • Returns both successful addons and aggregated error (lines 460-462)

This allows the caller to update status with partial progress, which is much better than an all-or-nothing approach.

Optional: Consider using errors.Join (Go 1.20+) for better error aggregation:

 	if len(enableErrs) > 0 {
-		return enabledAddons, fmt.Errorf("one or more addons were not enabled: %v", enableErrs)
+		return enabledAddons, fmt.Errorf("one or more addons were not enabled: %w", errors.Join(enableErrs...))
 	}

This provides better error unwrapping support for callers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9353cb0 and 0678274.

📒 Files selected for processing (16)
  • fleetconfig-controller/api/v1beta1/constants.go (1 hunks)
  • fleetconfig-controller/api/v1beta1/spoke_types.go (2 hunks)
  • fleetconfig-controller/api/v1beta1/zz_generated.deepcopy.go (2 hunks)
  • fleetconfig-controller/build/Dockerfile.base (1 hunks)
  • fleetconfig-controller/build/Dockerfile.devspace (1 hunks)
  • fleetconfig-controller/build/Dockerfile.eks (1 hunks)
  • fleetconfig-controller/build/Dockerfile.gke (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml (1 hunks)
  • fleetconfig-controller/docs/addon-framework.md (1 hunks)
  • fleetconfig-controller/hack/.versions.env (1 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/addon.go (6 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (4 hunks)
  • fleetconfig-controller/test/data/addon-2-cm.yaml (1 hunks)
  • fleetconfig-controller/test/data/fleetconfig-values.yaml (2 hunks)
  • fleetconfig-controller/test/e2e/helper.go (4 hunks)
  • fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (3 hunks)
🧰 Additional context used
🧠 Learnings (14)
📚 Learning: 2025-08-22T17:55:52.159Z
Learnt from: TylerGillson
Repo: open-cluster-management-io/lab PR: 51
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-22T17:55:52.159Z
Learning: In the open-cluster-management-io/lab repository, chart versioning for fleetconfig-controller is handled automatically via GitHub release workflows, not through manual version bumps in Chart.yaml during regular PRs.

Applied to files:

  • fleetconfig-controller/hack/.versions.env
  • fleetconfig-controller/build/Dockerfile.base
  • fleetconfig-controller/build/Dockerfile.gke
  • fleetconfig-controller/build/Dockerfile.devspace
  • fleetconfig-controller/test/data/fleetconfig-values.yaml
  • fleetconfig-controller/build/Dockerfile.eks
  • fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-08-27T21:58:32.141Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.

Applied to files:

  • fleetconfig-controller/hack/.versions.env
  • fleetconfig-controller/build/Dockerfile.base
  • fleetconfig-controller/build/Dockerfile.gke
  • fleetconfig-controller/build/Dockerfile.devspace
  • fleetconfig-controller/build/Dockerfile.eks
📚 Learning: 2025-09-22T18:42:03.404Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.

Applied to files:

  • fleetconfig-controller/build/Dockerfile.base
  • fleetconfig-controller/docs/addon-framework.md
  • fleetconfig-controller/build/Dockerfile.devspace
  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/test/data/fleetconfig-values.yaml
  • fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-09-25T23:31:11.630Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml:110-112
Timestamp: 2025-09-25T23:31:11.630Z
Learning: The fleetconfig-controller-manager spoke agent requires create/update/patch/delete permissions on CustomResourceDefinitions because `clusteradm upgrade klusterlet` operations need create/update permissions and cleanup operations require delete permissions for proper lifecycle management.

Applied to files:

  • fleetconfig-controller/build/Dockerfile.base
  • fleetconfig-controller/test/data/fleetconfig-values.yaml
  • fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/test/e2e/helper.go
  • fleetconfig-controller/test/data/fleetconfig-values.yaml
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
  • fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go
  • fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go
  • fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-09-26T04:15:34.475Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.475Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.

Applied to files:

  • fleetconfig-controller/test/e2e/helper.go
  • fleetconfig-controller/internal/controller/v1beta1/addon.go
  • fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go
📚 Learning: 2025-09-26T04:13:12.146Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:719-724
Timestamp: 2025-09-26T04:13:12.146Z
Learning: The TestValidateAddonUniqueness function in fleetconfig-controller/internal/webhook/v1beta1/validation_test.go has flawed assertion logic that uses `len(err.Error()) >= len(expectedMsg)` to check error messages, which doesn't actually validate message content and will pass for any error longer than the expected message.

Applied to files:

  • fleetconfig-controller/test/e2e/helper.go
  • fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go
📚 Learning: 2025-09-25T23:26:18.327Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/cmd/manager/manager.go:290-292
Timestamp: 2025-09-25T23:26:18.327Z
Learning: ManagedClusterAddOn hub kubeconfigs in Open Cluster Management use certificate-authority-data (base64-encoded inline CA certificate) rather than certificate-authority (file path reference). This makes the kubeconfig self-contained and eliminates the need to patch CA file paths.

Applied to files:

  • fleetconfig-controller/test/e2e/helper.go
  • fleetconfig-controller/test/data/fleetconfig-values.yaml
  • fleetconfig-controller/internal/controller/v1beta1/addon.go
  • fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
  • fleetconfig-controller/test/data/addon-2-cm.yaml
📚 Learning: 2025-10-01T20:56:57.301Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/cmd/manager/manager.go:274-277
Timestamp: 2025-10-01T20:56:57.301Z
Learning: In fleetconfig-controller/cmd/manager/manager.go, the hub kubeconfig read by getHubRestConfig() is auto-generated and mounted with a consistent format, not user-supplied. The blanket string replacement for tls.crt and tls.key paths is suitable for this controlled environment.

Applied to files:

  • fleetconfig-controller/test/e2e/helper.go
  • fleetconfig-controller/test/data/fleetconfig-values.yaml
📚 Learning: 2025-09-22T19:26:11.020Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/test/data/fleetconfig-v1alpha1.yaml:47-53
Timestamp: 2025-09-22T19:26:11.020Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller tests, the kubeconfigKey is intentionally set to "value" in test fixtures (fleetconfig-v1alpha1.yaml, fleetconfig-values.yaml) because that's how the test harness provisions the kubeconfig secret during test setup. This differs from the chart default of "kubeconfig" but is correct for the test environment.

Applied to files:

  • fleetconfig-controller/test/data/fleetconfig-values.yaml
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (4)
fleetconfig-controller/test/e2e/helper.go (1)
fleetconfig-controller/test/utils/utils.go (2)
  • WarnError (72-75)
  • Info (66-69)
fleetconfig-controller/internal/controller/v1beta1/addon.go (5)
fleetconfig-controller/api/v1beta1/constants.go (2)
  • ManagedBySelector (208-208)
  • AnnotationAddOnDeploymentConfigHash (169-169)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)
  • AddOn (278-291)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (1)
  • SanitizeArgs (82-84)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (3)
fleetconfig-controller/api/v1beta1/constants.go (7)
  • AddonsConfigured (22-22)
  • InstanceTypeUnified (89-89)
  • FCCAddOnName (116-116)
  • ControllerNamespaceEnvVar (107-107)
  • HubNamespaceEnvVar (104-104)
  • SpokeNamespaceEnvVar (101-101)
  • PurgeAgentNamespaceEnvVar (113-113)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (52-58)
fleetconfig-controller/api/v1beta1/spoke_types.go (3)
  • Spoke (319-333)
  • HubRef (119-127)
  • CleanupConfig (92-116)
fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (1)
fleetconfig-controller/test/utils/utils.go (1)
  • WarnError (72-75)
🪛 LanguageTool
fleetconfig-controller/docs/addon-framework.md

[grammar] ~5-~5: Ensure spelling is correct
Context: ...g clusteradm commands, you can define addons in your Hub and Spoke resources, and FC...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~11-~11: Ensure spelling is correct
Context: ...cy Framework) via HubAddOn ## Custom Addons ### How It Works FCC wraps the `clusteradm ...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~22-~22: Ensure spelling is correct
Context: ...configuration ### Configuring a Custom Addon #### Step 1: Create a ConfigMap with Addon Ma...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~55-~55: Ensure spelling is correct
Context: ...ne AddOnConfig in Hub Resource Add the addon configuration to your Hub resource: ``...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~131-~131: Ensure spelling is correct
Context: ...n Hub Addons FCC includes two built-in addons that can be easily enabled without requ...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~165-~165: Ensure spelling is correct
Context: ... Addons on Spoke Clusters Built-in hub addons can be enabled on spoke clusters the sa...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~180-~180: Ensure spelling is correct
Context: ...ork ``` ## Implementation Details The addon creation logic in FCC (found in `intern...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~212-~212: Ensure spelling is correct
Context: ...# For Built-in Hub Addons Built-in hub addons follow a similar pattern but use pre-co...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~216-~216: Ensure spelling is correct
Context: ...ecify explicit versions for your custom addons to ensure predictable deployments. [Sem...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (20)
fleetconfig-controller/hack/.versions.env (1)

3-3: CLUSTERADM version bump looks consistent

CLUSTERADM_VERSION=1.1.0 here aligns with the Dockerfile updates; no issues from this file’s perspective.

fleetconfig-controller/build/Dockerfile.devspace (1)

19-19: Devspace image now standardizes on CLUSTERADM 1.1.0

ARG CLUSTERADM_VERSION is updated and consistent with the rest of the build tooling.

fleetconfig-controller/build/Dockerfile.gke (1)

40-40: GKE build CLUSTERADM version aligned

The dependencies stage now uses CLUSTERADM_VERSION=1.1.0, matching other build targets.

fleetconfig-controller/build/Dockerfile.eks (1)

41-41: EKS build now uses CLUSTERADM 1.1.0

The version bump is consistent with other Dockerfiles and the hack/.versions.env file.

fleetconfig-controller/build/Dockerfile.base (1)

41-41: Base image clusteradm version standardized

Base dependencies now install clusteradm 1.1.0, keeping all images in sync.

fleetconfig-controller/test/data/addon-2-cm.yaml (1)

10-20: Addon test ConfigMap and variable placeholders look correct

Using a shared test-addon namespace and adding the second addon-variables-test ConfigMap with CLUSTER_NAME, FOO, and NEW_KEY matches the described e2e variable-resolution flow; the embedded multi-doc YAML is well-formed.

fleetconfig-controller/test/data/fleetconfig-values.yaml (2)

13-21: Hub test addon manifest extended for variables

The additional ConfigMap in manifests with CLUSTER_NAME and FOO placeholders is structurally sound and lines up with the addon variable-resolution tests.


50-56: Spoke deploymentConfig wiring matches addon variable usage

deploymentConfig with agentInstallNamespace: test-addon and customizedVariables for CLUSTER_NAME/FOO aligns with the hub-side templates and the new AddOnDeploymentConfig API.

fleetconfig-controller/api/v1beta1/constants.go (1)

165-170: New deployment config hash annotation constant looks good

AnnotationAddOnDeploymentConfigHash follows the existing naming pattern and clearly documents its role in tracking add-on deployment configuration revisions.

fleetconfig-controller/api/v1beta1/zz_generated.deepcopy.go (1)

25-26: Deep copy for DeploymentConfig is correctly wired

The new import and DeploymentConfig deep‑copy block follow the existing pointer pattern and safely handle the nil case; this keeps AddOn copies isolated when AddOnDeploymentConfigSpec is present.

Also applies to: 38-42

fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (1)

107-110: Addon variable resolution e2e flow looks consistent

The new tests that (1) assert initial addon variables, (2) update only the FOO variable via the Spoke’s AddOns[i].DeploymentConfig.CustomizedVariables, and (3) verify behavior after bumping the addon template version align with ensureAddonVariablesResolved and addonData in helper.go. The use of Eventually around both the spec update and the assertions should give the controller enough time to reconcile.

Also applies to: 244-288

fleetconfig-controller/test/e2e/helper.go (2)

43-48: Addon variable helpers are cohesive and robust

Centralizing the ConfigMap/keys as constants, reusing addonData for namespace/version metadata, and encapsulating the CLUSTER_NAME/FOO/NEW_KEY checks in ensureAddonVariablesResolved gives you clear, retry‑friendly assertions with deterministic logging. The extra NEW_KEY assertion only for addonIdx == 1 cleanly models the v2.0.0 behavior.

Also applies to: 68-84, 422-468


493-499: Updated Hub log message improves clarity

Changing the warning message to “failed to get Hub” correctly reflects the resource being fetched in updateHubAddon.

fleetconfig-controller/api/v1beta1/spoke_types.go (1)

26-27: AddOn DeploymentConfig field is well-integrated

Wiring DeploymentConfig as *addonv1alpha1.AddOnDeploymentConfigSpec with omitempty cleanly exposes deployment configuration without breaking existing callers when nil, and matches the generated deepcopy logic. This is a sensible API surface for driving AddOnDeploymentConfig creation.

Also applies to: 277-291

fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1)

347-365: Using a Spoke deep copy for FCC addon config before enable is a good pattern

Deep‑copying the Spoke, applying FCC‑specific deployment config via configureFCCAddOn, and passing the copy to handleSpokeAddons keeps the reconciler input immutable while still recording EnabledAddons and AddonsConfigured on the real Spoke. This is a clean separation of transient addon‑enable configuration from persisted status.

fleetconfig-controller/docs/addon-framework.md (1)

1-243: Excellent documentation for the addon framework!

This documentation is comprehensive and well-structured. It clearly explains:

  • The distinction between custom and built-in addons
  • Step-by-step setup instructions with working examples
  • The new deploymentConfig field and its capabilities
  • Implementation details for developers
  • Best practices and troubleshooting guidance

The code examples accurately reflect the CRD structure and the deploymentConfig examples on lines 108-115 properly demonstrate the new functionality.

Note: The static analysis warnings about "addon" spelling are false positives - both "addon" and "add-on" are acceptable in technical documentation, and your usage is consistent.

fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml (1)

63-300: Well-structured CRD update for deployment configuration!

The new deploymentConfig field provides comprehensive configuration options for addon deployment. The CRD includes:

  • Proper validation rules (DNS-1123 labels, variable name patterns)
  • Appropriate defaults (e.g., agentInstallNamespace defaulting to open-cluster-management-agent-addon)
  • Correct use of Kubernetes list semantics (map types for unique entries)
  • Comprehensive configuration options matching OCM's AddOnDeploymentConfigSpec

The structure enables fine-grained control over addon deployment including node placement, resource requirements, proxy settings, and image registries.

fleetconfig-controller/internal/controller/v1beta1/addon.go (3)

290-343: Solid hash-based change detection implementation!

The approach of computing a hash of the entire AddOn struct and storing it in annotations is clean and effective for detecting configuration changes. Key strengths:

  • Conservative error handling: if hash computation fails, the addon will be re-enabled (line 324)
  • Correct detection of changes via hash comparison (line 337)
  • Proper disable-then-enable flow since clusteradm doesn't support in-place updates (lines 339-342)

One aspect to consider: The hash includes the entire AddOn struct, so changes to Annotations or ConfigName will also trigger re-enablement. This is likely the desired behavior since annotations affect the MCA resource, but worth documenting if users might be surprised by annotation-only changes causing addon restarts.


405-443: Clean implementation of config hash and deployment config file handling!

The code properly:

  • Computes config hash per addon with error handling (lines 405-410)
  • Only adds hash annotation when computation succeeds (line 417)
  • Preserves user annotations while adding the system hash annotation (lines 412-419)
  • Creates temporary deployment config files when needed (lines 431-443)
  • Handles cleanup via deferred functions (line 435)

One minor note: When processing multiple addons with DeploymentConfig, all temporary files will exist until the function completes since cleanup is deferred (line 435). This is acceptable since temp files are small and automatically cleaned up, but consider using defer cleanup() inside a function closure if memory usage becomes a concern for large addon lists.


466-495: Well-implemented temporary config file generation!

The createAddonDeploymentConfigFile function properly:

  • Constructs a valid AddOnDeploymentConfig resource with correct TypeMeta and ObjectMeta (lines 470-480)
  • Uses the managed cluster's namespace (spoke.Name) for the config resource, following OCM conventions
  • Marshals to YAML using the appropriate library that handles Kubernetes types (line 483)
  • Returns a cleanup function following the standard defer pattern (line 494)
  • Provides comprehensive error handling at each step

The function integrates cleanly with the existing temporary file management utilities.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
fleetconfig-controller/internal/controller/v1beta1/addon.go (1)

398-403: Same hash computation error issue as in handleSpokeAddons.

This has the same problem as noted at lines 320-325 in handleSpokeAddons. When hash computation fails, configHash = "" is set, and the addon will be enabled with an empty hash annotation. On the next reconciliation, the empty hash won't match the newly computed hash, triggering unnecessary re-enable.

Consider the same fix: fail fast or skip the addon when hash computation fails.

🧹 Nitpick comments (3)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (2)

359-363: Consider setting AddonsConfigured condition even when no addons are enabled.

The condition is only set to True when len(enabledAddons) > 0. If a user explicitly removes all addons from the spec (setting AddOns: []), the condition won't be updated, leaving it in whatever state it was previously. This could mislead users about the actual addon state.

Consider setting the condition in all cases:

-	if len(enabledAddons) > 0 {
-		spoke.SetConditions(true, v1beta1.NewCondition(
-			v1beta1.AddonsConfigured, v1beta1.AddonsConfigured, metav1.ConditionTrue, metav1.ConditionTrue,
-		))
-	}
+	if len(enabledAddons) > 0 {
+		spoke.SetConditions(true, v1beta1.NewCondition(
+			v1beta1.AddonsConfigured, v1beta1.AddonsConfigured, metav1.ConditionTrue, metav1.ConditionTrue,
+		))
+	} else {
+		spoke.SetConditions(true, v1beta1.NewCondition(
+			"No addons configured", v1beta1.AddonsConfigured, metav1.ConditionFalse, metav1.ConditionTrue,
+		))
+	}

721-724: Reconsider AddonsConfigured condition state after addon deletion.

Similar to the cleanup path at lines 674-677, this sets AddonsConfigured to True immediately after successfully deleting the FCC agent addon (line 715). Since the addon is being removed, setting the condition to False with a message like "Addons disabled" would more accurately reflect the cluster state.

fleetconfig-controller/internal/controller/v1beta1/addon.go (1)

426-433: Multiple deferred cleanups are safe but consider noting complexity.

The cleanup function from createAddonDeploymentConfigFile is deferred inside the loop (line 428). If multiple addons have DeploymentConfig, multiple deferred cleanups will accumulate and execute in LIFO order when the function returns. This is safe and correct, but could be less obvious to future maintainers.

The current implementation is fine, but if clarity is preferred, you could collect cleanup functions and call them explicitly after the loop.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0678274 and 3fe6acd.

📒 Files selected for processing (2)
  • fleetconfig-controller/internal/controller/v1beta1/addon.go (6 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (4 hunks)
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/cmd/manager/manager.go:290-292
Timestamp: 2025-09-25T23:26:18.327Z
Learning: ManagedClusterAddOn hub kubeconfigs in Open Cluster Management use certificate-authority-data (base64-encoded inline CA certificate) rather than certificate-authority (file path reference). This makes the kubeconfig self-contained and eliminates the need to patch CA file paths.
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-26T04:15:34.475Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.475Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:26:18.327Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/cmd/manager/manager.go:290-292
Timestamp: 2025-09-25T23:26:18.327Z
Learning: ManagedClusterAddOn hub kubeconfigs in Open Cluster Management use certificate-authority-data (base64-encoded inline CA certificate) rather than certificate-authority (file path reference). This makes the kubeconfig self-contained and eliminates the need to patch CA file paths.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-26T04:13:12.146Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:719-724
Timestamp: 2025-09-26T04:13:12.146Z
Learning: The TestValidateAddonUniqueness function in fleetconfig-controller/internal/webhook/v1beta1/validation_test.go has flawed assertion logic that uses `len(err.Error()) >= len(expectedMsg)` to check error messages, which doesn't actually validate message content and will pass for any error longer than the expected message.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (3)
fleetconfig-controller/api/v1beta1/constants.go (7)
  • AddonsConfigured (22-22)
  • InstanceTypeUnified (89-89)
  • FCCAddOnName (116-116)
  • ControllerNamespaceEnvVar (107-107)
  • HubNamespaceEnvVar (104-104)
  • SpokeNamespaceEnvVar (101-101)
  • PurgeAgentNamespaceEnvVar (113-113)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (52-58)
fleetconfig-controller/api/v1beta1/spoke_types.go (3)
  • Spoke (319-333)
  • HubRef (119-127)
  • CleanupConfig (92-116)
fleetconfig-controller/internal/controller/v1beta1/addon.go (5)
fleetconfig-controller/api/v1beta1/constants.go (2)
  • ManagedBySelector (208-208)
  • AnnotationAddOnDeploymentConfigHash (169-169)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)
  • AddOn (278-291)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (1)
  • SanitizeArgs (82-84)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
🔇 Additional comments (7)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (3)

368-383: LGTM - Guard logic correctly prevents panic.

The fix using fccIdx := -1 initialization and early return when no FCC addon is found correctly addresses the critical issue flagged in the previous review. This ensures the function won't panic on empty or non-matching spoke.Spec.AddOns.


432-435: Good use of deterministic sorting.

Sorting the CustomizedVariables by name is essential to ensure the hash computation remains stable across reconciliation loops. Without this, identical configurations could produce different hashes due to map iteration order, causing unnecessary addon re-enables.


674-677: Clarify the AddonsConfigured condition state transition.

After calling handleSpokeAddons to disable addons (line 668), the code sets AddonsConfigured to True (lines 674-677). Later, at lines 704-706, it's set to False after waiting for cleanup. This state transition (True → False) after a disable operation seems counterintuitive.

Is the intent to signal "addon configuration successfully applied" (even if that configuration is to disable them) before transitioning to "no addons configured"? If so, consider clarifying with a more descriptive message, or consolidate the condition updates to avoid the intermediate True state.

fleetconfig-controller/internal/controller/v1beta1/addon.go (4)

290-302: LGTM - Efficient approach to read current addon state.

Building a map of ManagedClusterAddOn objects to access their config hash annotations is efficient and avoids N+1 API calls. The error handling gracefully defaults to an empty list, ensuring the reconciliation continues even if the List operation fails.


320-321: Verify: Hashing entire AddOn struct includes user annotations.

The hash is computed on the entire AddOn struct, which includes ConfigName, Annotations, and DeploymentConfig. This means changes to user-supplied annotations (not just DeploymentConfig) will trigger addon re-enable.

If this is intentional (to detect any addon configuration change), document it. If only DeploymentConfig changes should trigger re-enable, consider hashing only that field:

configHash, err := hash.ComputeHash(addon.DeploymentConfig)

353-363: LGTM - Error handling correctly preserves state on disable failure.

The code now correctly returns enabledAddons unchanged when handleAddonDisable fails (lines 354-356), accurately reflecting that the cluster state hasn't changed. Only after successful disable does it update the list (lines 358-363). This addresses the issue flagged in the previous review.


459-488: LGTM - Clean implementation of config file generation.

The function properly constructs an AddOnDeploymentConfig resource with correct TypeMeta and ObjectMeta, marshals it to YAML, and writes to a temporary file with proper cleanup handling. The returned cleanup function follows the established pattern in the codebase.

@ahmad-ibra
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Nov 19, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 95e37c1 into open-cluster-management-io:main Nov 19, 2025
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants