-
Notifications
You must be signed in to change notification settings - Fork 7
✨ expose addon configuration in Spoke API #87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ expose addon configuration in Spoke API #87
Conversation
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
WalkthroughAdds addon deployment configuration support: replaces AddOn.InstallNamespace with DeploymentConfig, introduces a deployment-config hash annotation, refactors addon enable/disable to use a temporary deployment-config file passed to clusteradm, updates CRD/schema, generated deepcopy, Dockerfile versions, docs, and e2e tests. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: arturshadnik The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1)
359-363: Double‑checkAddonsConfiguredcondition semantics across enable/cleanup pathsYou now set
AddonsConfiguredtoTrue:
- After successfully enabling addons in
doHubWork.- After disabling addons in
hubCleanupPreflight(“Success – addons disabled/cleaned up”).- After deleting the FCC agent addon in
waitForAgentAddonDeleted.Later, after
waitForAddonManifestWorksCleanup, you flip it toFalseto signal full cleanup.If consumers interpret
status.conditions[AddonsConfigured].statusstrictly as “addons are currently configured/enabled”, the intermediateTruestates during teardown may be confusing. If instead it’s meant as a more generic “addon operations succeeded” flag, this is fine, but it may be worth:
- Using more specific reasons/messages for the cleanup paths, or
- Documenting that
Falsespecifically means “all addon manifestWorks cleaned up; no enabled addons remain”.Not a blocker, but clarifying this now will avoid ambiguity for future readers or external controllers.
Also applies to: 670-673, 717-720
fleetconfig-controller/internal/controller/v1beta1/addon.go (1)
398-464: Good partial success handling for addon enablement!The function correctly tracks which addons were successfully enabled even when others fail:
- Collects errors in
enableErrsslice (lines 398, 438, 452)- Only adds to
enabledAddonson success (line 456)- Returns both successful addons and aggregated error (lines 460-462)
This allows the caller to update status with partial progress, which is much better than an all-or-nothing approach.
Optional: Consider using
errors.Join(Go 1.20+) for better error aggregation:if len(enableErrs) > 0 { - return enabledAddons, fmt.Errorf("one or more addons were not enabled: %v", enableErrs) + return enabledAddons, fmt.Errorf("one or more addons were not enabled: %w", errors.Join(enableErrs...)) }This provides better error unwrapping support for callers.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
fleetconfig-controller/api/v1beta1/constants.go(1 hunks)fleetconfig-controller/api/v1beta1/spoke_types.go(2 hunks)fleetconfig-controller/api/v1beta1/zz_generated.deepcopy.go(2 hunks)fleetconfig-controller/build/Dockerfile.base(1 hunks)fleetconfig-controller/build/Dockerfile.devspace(1 hunks)fleetconfig-controller/build/Dockerfile.eks(1 hunks)fleetconfig-controller/build/Dockerfile.gke(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml(1 hunks)fleetconfig-controller/docs/addon-framework.md(1 hunks)fleetconfig-controller/hack/.versions.env(1 hunks)fleetconfig-controller/internal/controller/v1beta1/addon.go(6 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(4 hunks)fleetconfig-controller/test/data/addon-2-cm.yaml(1 hunks)fleetconfig-controller/test/data/fleetconfig-values.yaml(2 hunks)fleetconfig-controller/test/e2e/helper.go(4 hunks)fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go(3 hunks)
🧰 Additional context used
🧠 Learnings (14)
📚 Learning: 2025-08-22T17:55:52.159Z
Learnt from: TylerGillson
Repo: open-cluster-management-io/lab PR: 51
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-22T17:55:52.159Z
Learning: In the open-cluster-management-io/lab repository, chart versioning for fleetconfig-controller is handled automatically via GitHub release workflows, not through manual version bumps in Chart.yaml during regular PRs.
Applied to files:
fleetconfig-controller/hack/.versions.envfleetconfig-controller/build/Dockerfile.basefleetconfig-controller/build/Dockerfile.gkefleetconfig-controller/build/Dockerfile.devspacefleetconfig-controller/test/data/fleetconfig-values.yamlfleetconfig-controller/build/Dockerfile.eksfleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-08-27T21:58:32.141Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.
Applied to files:
fleetconfig-controller/hack/.versions.envfleetconfig-controller/build/Dockerfile.basefleetconfig-controller/build/Dockerfile.gkefleetconfig-controller/build/Dockerfile.devspacefleetconfig-controller/build/Dockerfile.eks
📚 Learning: 2025-09-22T18:42:03.404Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.
Applied to files:
fleetconfig-controller/build/Dockerfile.basefleetconfig-controller/docs/addon-framework.mdfleetconfig-controller/build/Dockerfile.devspacefleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/test/data/fleetconfig-values.yamlfleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-09-25T23:31:11.630Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml:110-112
Timestamp: 2025-09-25T23:31:11.630Z
Learning: The fleetconfig-controller-manager spoke agent requires create/update/patch/delete permissions on CustomResourceDefinitions because `clusteradm upgrade klusterlet` operations need create/update permissions and cleanup operations require delete permissions for proper lifecycle management.
Applied to files:
fleetconfig-controller/build/Dockerfile.basefleetconfig-controller/test/data/fleetconfig-values.yamlfleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/test/e2e/helper.gofleetconfig-controller/test/data/fleetconfig-values.yamlfleetconfig-controller/internal/controller/v1beta1/spoke_handler.gofleetconfig-controller/test/e2e/v1beta1_hub_spoke.gofleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/test/e2e/v1beta1_hub_spoke.gofleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
📚 Learning: 2025-09-26T04:15:34.475Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.475Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.
Applied to files:
fleetconfig-controller/test/e2e/helper.gofleetconfig-controller/internal/controller/v1beta1/addon.gofleetconfig-controller/test/e2e/v1beta1_hub_spoke.go
📚 Learning: 2025-09-26T04:13:12.146Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:719-724
Timestamp: 2025-09-26T04:13:12.146Z
Learning: The TestValidateAddonUniqueness function in fleetconfig-controller/internal/webhook/v1beta1/validation_test.go has flawed assertion logic that uses `len(err.Error()) >= len(expectedMsg)` to check error messages, which doesn't actually validate message content and will pass for any error longer than the expected message.
Applied to files:
fleetconfig-controller/test/e2e/helper.gofleetconfig-controller/test/e2e/v1beta1_hub_spoke.go
📚 Learning: 2025-09-25T23:26:18.327Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/cmd/manager/manager.go:290-292
Timestamp: 2025-09-25T23:26:18.327Z
Learning: ManagedClusterAddOn hub kubeconfigs in Open Cluster Management use certificate-authority-data (base64-encoded inline CA certificate) rather than certificate-authority (file path reference). This makes the kubeconfig self-contained and eliminates the need to patch CA file paths.
Applied to files:
fleetconfig-controller/test/e2e/helper.gofleetconfig-controller/test/data/fleetconfig-values.yamlfleetconfig-controller/internal/controller/v1beta1/addon.gofleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yamlfleetconfig-controller/test/data/addon-2-cm.yaml
📚 Learning: 2025-10-01T20:56:57.301Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/cmd/manager/manager.go:274-277
Timestamp: 2025-10-01T20:56:57.301Z
Learning: In fleetconfig-controller/cmd/manager/manager.go, the hub kubeconfig read by getHubRestConfig() is auto-generated and mounted with a consistent format, not user-supplied. The blanket string replacement for tls.crt and tls.key paths is suitable for this controlled environment.
Applied to files:
fleetconfig-controller/test/e2e/helper.gofleetconfig-controller/test/data/fleetconfig-values.yaml
📚 Learning: 2025-09-22T19:26:11.020Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/test/data/fleetconfig-v1alpha1.yaml:47-53
Timestamp: 2025-09-22T19:26:11.020Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller tests, the kubeconfigKey is intentionally set to "value" in test fixtures (fleetconfig-v1alpha1.yaml, fleetconfig-values.yaml) because that's how the test harness provisions the kubeconfig secret during test setup. This differs from the chart default of "kubeconfig" but is correct for the test environment.
Applied to files:
fleetconfig-controller/test/data/fleetconfig-values.yaml
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (4)
fleetconfig-controller/test/e2e/helper.go (1)
fleetconfig-controller/test/utils/utils.go (2)
WarnError(72-75)Info(66-69)
fleetconfig-controller/internal/controller/v1beta1/addon.go (5)
fleetconfig-controller/api/v1beta1/constants.go (2)
ManagedBySelector(208-208)AnnotationAddOnDeploymentConfigHash(169-169)fleetconfig-controller/api/v1beta1/spoke_types.go (1)
AddOn(278-291)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (1)
SanitizeArgs(82-84)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (3)
fleetconfig-controller/api/v1beta1/constants.go (7)
AddonsConfigured(22-22)InstanceTypeUnified(89-89)FCCAddOnName(116-116)ControllerNamespaceEnvVar(107-107)HubNamespaceEnvVar(104-104)SpokeNamespaceEnvVar(101-101)PurgeAgentNamespaceEnvVar(113-113)fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(52-58)fleetconfig-controller/api/v1beta1/spoke_types.go (3)
Spoke(319-333)HubRef(119-127)CleanupConfig(92-116)
fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (1)
fleetconfig-controller/test/utils/utils.go (1)
WarnError(72-75)
🪛 LanguageTool
fleetconfig-controller/docs/addon-framework.md
[grammar] ~5-~5: Ensure spelling is correct
Context: ...g clusteradm commands, you can define addons in your Hub and Spoke resources, and FC...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~11-~11: Ensure spelling is correct
Context: ...cy Framework) via HubAddOn ## Custom Addons ### How It Works FCC wraps the `clusteradm ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~22-~22: Ensure spelling is correct
Context: ...configuration ### Configuring a Custom Addon #### Step 1: Create a ConfigMap with Addon Ma...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~55-~55: Ensure spelling is correct
Context: ...ne AddOnConfig in Hub Resource Add the addon configuration to your Hub resource: ``...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~131-~131: Ensure spelling is correct
Context: ...n Hub Addons FCC includes two built-in addons that can be easily enabled without requ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~165-~165: Ensure spelling is correct
Context: ... Addons on Spoke Clusters Built-in hub addons can be enabled on spoke clusters the sa...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~180-~180: Ensure spelling is correct
Context: ...ork ``` ## Implementation Details The addon creation logic in FCC (found in `intern...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~212-~212: Ensure spelling is correct
Context: ...# For Built-in Hub Addons Built-in hub addons follow a similar pattern but use pre-co...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~216-~216: Ensure spelling is correct
Context: ...ecify explicit versions for your custom addons to ensure predictable deployments. [Sem...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (20)
fleetconfig-controller/hack/.versions.env (1)
3-3: CLUSTERADM version bump looks consistentCLUSTERADM_VERSION=1.1.0 here aligns with the Dockerfile updates; no issues from this file’s perspective.
fleetconfig-controller/build/Dockerfile.devspace (1)
19-19: Devspace image now standardizes on CLUSTERADM 1.1.0ARG CLUSTERADM_VERSION is updated and consistent with the rest of the build tooling.
fleetconfig-controller/build/Dockerfile.gke (1)
40-40: GKE build CLUSTERADM version alignedThe dependencies stage now uses CLUSTERADM_VERSION=1.1.0, matching other build targets.
fleetconfig-controller/build/Dockerfile.eks (1)
41-41: EKS build now uses CLUSTERADM 1.1.0The version bump is consistent with other Dockerfiles and the hack/.versions.env file.
fleetconfig-controller/build/Dockerfile.base (1)
41-41: Base image clusteradm version standardizedBase dependencies now install clusteradm 1.1.0, keeping all images in sync.
fleetconfig-controller/test/data/addon-2-cm.yaml (1)
10-20: Addon test ConfigMap and variable placeholders look correctUsing a shared
test-addonnamespace and adding the secondaddon-variables-testConfigMap withCLUSTER_NAME,FOO, andNEW_KEYmatches the described e2e variable-resolution flow; the embedded multi-doc YAML is well-formed.fleetconfig-controller/test/data/fleetconfig-values.yaml (2)
13-21: Hub test addon manifest extended for variablesThe additional ConfigMap in
manifestswithCLUSTER_NAMEandFOOplaceholders is structurally sound and lines up with the addon variable-resolution tests.
50-56: Spoke deploymentConfig wiring matches addon variable usage
deploymentConfigwithagentInstallNamespace: test-addonandcustomizedVariablesforCLUSTER_NAME/FOOaligns with the hub-side templates and the new AddOnDeploymentConfig API.fleetconfig-controller/api/v1beta1/constants.go (1)
165-170: New deployment config hash annotation constant looks good
AnnotationAddOnDeploymentConfigHashfollows the existing naming pattern and clearly documents its role in tracking add-on deployment configuration revisions.fleetconfig-controller/api/v1beta1/zz_generated.deepcopy.go (1)
25-26: Deep copy forDeploymentConfigis correctly wiredThe new import and
DeploymentConfigdeep‑copy block follow the existing pointer pattern and safely handle the nil case; this keepsAddOncopies isolated whenAddOnDeploymentConfigSpecis present.Also applies to: 38-42
fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (1)
107-110: Addon variable resolution e2e flow looks consistentThe new tests that (1) assert initial addon variables, (2) update only the FOO variable via the Spoke’s
AddOns[i].DeploymentConfig.CustomizedVariables, and (3) verify behavior after bumping the addon template version align withensureAddonVariablesResolvedandaddonDatainhelper.go. The use ofEventuallyaround both the spec update and the assertions should give the controller enough time to reconcile.Also applies to: 244-288
fleetconfig-controller/test/e2e/helper.go (2)
43-48: Addon variable helpers are cohesive and robustCentralizing the ConfigMap/keys as constants, reusing
addonDatafor namespace/version metadata, and encapsulating the CLUSTER_NAME/FOO/NEW_KEY checks inensureAddonVariablesResolvedgives you clear, retry‑friendly assertions with deterministic logging. The extra NEW_KEY assertion only foraddonIdx == 1cleanly models the v2.0.0 behavior.Also applies to: 68-84, 422-468
493-499: Updated Hub log message improves clarityChanging the warning message to “failed to get Hub” correctly reflects the resource being fetched in
updateHubAddon.fleetconfig-controller/api/v1beta1/spoke_types.go (1)
26-27: AddOnDeploymentConfigfield is well-integratedWiring
DeploymentConfigas*addonv1alpha1.AddOnDeploymentConfigSpecwithomitemptycleanly exposes deployment configuration without breaking existing callers when nil, and matches the generated deepcopy logic. This is a sensible API surface for drivingAddOnDeploymentConfigcreation.Also applies to: 277-291
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1)
347-365: Using a Spoke deep copy for FCC addon config before enable is a good patternDeep‑copying the Spoke, applying FCC‑specific deployment config via
configureFCCAddOn, and passing the copy tohandleSpokeAddonskeeps the reconciler input immutable while still recordingEnabledAddonsandAddonsConfiguredon the real Spoke. This is a clean separation of transient addon‑enable configuration from persisted status.fleetconfig-controller/docs/addon-framework.md (1)
1-243: Excellent documentation for the addon framework!This documentation is comprehensive and well-structured. It clearly explains:
- The distinction between custom and built-in addons
- Step-by-step setup instructions with working examples
- The new
deploymentConfigfield and its capabilities- Implementation details for developers
- Best practices and troubleshooting guidance
The code examples accurately reflect the CRD structure and the
deploymentConfigexamples on lines 108-115 properly demonstrate the new functionality.Note: The static analysis warnings about "addon" spelling are false positives - both "addon" and "add-on" are acceptable in technical documentation, and your usage is consistent.
fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml (1)
63-300: Well-structured CRD update for deployment configuration!The new
deploymentConfigfield provides comprehensive configuration options for addon deployment. The CRD includes:
- Proper validation rules (DNS-1123 labels, variable name patterns)
- Appropriate defaults (e.g.,
agentInstallNamespacedefaulting toopen-cluster-management-agent-addon)- Correct use of Kubernetes list semantics (map types for unique entries)
- Comprehensive configuration options matching OCM's
AddOnDeploymentConfigSpecThe structure enables fine-grained control over addon deployment including node placement, resource requirements, proxy settings, and image registries.
fleetconfig-controller/internal/controller/v1beta1/addon.go (3)
290-343: Solid hash-based change detection implementation!The approach of computing a hash of the entire
AddOnstruct and storing it in annotations is clean and effective for detecting configuration changes. Key strengths:
- Conservative error handling: if hash computation fails, the addon will be re-enabled (line 324)
- Correct detection of changes via hash comparison (line 337)
- Proper disable-then-enable flow since clusteradm doesn't support in-place updates (lines 339-342)
One aspect to consider: The hash includes the entire
AddOnstruct, so changes toAnnotationsorConfigNamewill also trigger re-enablement. This is likely the desired behavior since annotations affect the MCA resource, but worth documenting if users might be surprised by annotation-only changes causing addon restarts.
405-443: Clean implementation of config hash and deployment config file handling!The code properly:
- Computes config hash per addon with error handling (lines 405-410)
- Only adds hash annotation when computation succeeds (line 417)
- Preserves user annotations while adding the system hash annotation (lines 412-419)
- Creates temporary deployment config files when needed (lines 431-443)
- Handles cleanup via deferred functions (line 435)
One minor note: When processing multiple addons with
DeploymentConfig, all temporary files will exist until the function completes since cleanup is deferred (line 435). This is acceptable since temp files are small and automatically cleaned up, but consider usingdefer cleanup()inside a function closure if memory usage becomes a concern for large addon lists.
466-495: Well-implemented temporary config file generation!The
createAddonDeploymentConfigFilefunction properly:
- Constructs a valid
AddOnDeploymentConfigresource with correct TypeMeta and ObjectMeta (lines 470-480)- Uses the managed cluster's namespace (
spoke.Name) for the config resource, following OCM conventions- Marshals to YAML using the appropriate library that handles Kubernetes types (line 483)
- Returns a cleanup function following the standard defer pattern (line 494)
- Provides comprehensive error handling at each step
The function integrates cleanly with the existing temporary file management utilities.
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
fleetconfig-controller/internal/controller/v1beta1/addon.go (1)
398-403: Same hash computation error issue as in handleSpokeAddons.This has the same problem as noted at lines 320-325 in
handleSpokeAddons. When hash computation fails,configHash = ""is set, and the addon will be enabled with an empty hash annotation. On the next reconciliation, the empty hash won't match the newly computed hash, triggering unnecessary re-enable.Consider the same fix: fail fast or skip the addon when hash computation fails.
🧹 Nitpick comments (3)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (2)
359-363: Consider setting AddonsConfigured condition even when no addons are enabled.The condition is only set to True when
len(enabledAddons) > 0. If a user explicitly removes all addons from the spec (settingAddOns: []), the condition won't be updated, leaving it in whatever state it was previously. This could mislead users about the actual addon state.Consider setting the condition in all cases:
- if len(enabledAddons) > 0 { - spoke.SetConditions(true, v1beta1.NewCondition( - v1beta1.AddonsConfigured, v1beta1.AddonsConfigured, metav1.ConditionTrue, metav1.ConditionTrue, - )) - } + if len(enabledAddons) > 0 { + spoke.SetConditions(true, v1beta1.NewCondition( + v1beta1.AddonsConfigured, v1beta1.AddonsConfigured, metav1.ConditionTrue, metav1.ConditionTrue, + )) + } else { + spoke.SetConditions(true, v1beta1.NewCondition( + "No addons configured", v1beta1.AddonsConfigured, metav1.ConditionFalse, metav1.ConditionTrue, + )) + }
721-724: Reconsider AddonsConfigured condition state after addon deletion.Similar to the cleanup path at lines 674-677, this sets
AddonsConfiguredtoTrueimmediately after successfully deleting the FCC agent addon (line 715). Since the addon is being removed, setting the condition toFalsewith a message like "Addons disabled" would more accurately reflect the cluster state.fleetconfig-controller/internal/controller/v1beta1/addon.go (1)
426-433: Multiple deferred cleanups are safe but consider noting complexity.The cleanup function from
createAddonDeploymentConfigFileis deferred inside the loop (line 428). If multiple addons haveDeploymentConfig, multiple deferred cleanups will accumulate and execute in LIFO order when the function returns. This is safe and correct, but could be less obvious to future maintainers.The current implementation is fine, but if clarity is preferred, you could collect cleanup functions and call them explicitly after the loop.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
fleetconfig-controller/internal/controller/v1beta1/addon.go(6 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(4 hunks)
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/cmd/manager/manager.go:290-292
Timestamp: 2025-09-25T23:26:18.327Z
Learning: ManagedClusterAddOn hub kubeconfigs in Open Cluster Management use certificate-authority-data (base64-encoded inline CA certificate) rather than certificate-authority (file path reference). This makes the kubeconfig self-contained and eliminates the need to patch CA file paths.
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-26T04:15:34.475Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.475Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.gofleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.gofleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:26:18.327Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/cmd/manager/manager.go:290-292
Timestamp: 2025-09-25T23:26:18.327Z
Learning: ManagedClusterAddOn hub kubeconfigs in Open Cluster Management use certificate-authority-data (base64-encoded inline CA certificate) rather than certificate-authority (file path reference). This makes the kubeconfig self-contained and eliminates the need to patch CA file paths.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-26T04:13:12.146Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:719-724
Timestamp: 2025-09-26T04:13:12.146Z
Learning: The TestValidateAddonUniqueness function in fleetconfig-controller/internal/webhook/v1beta1/validation_test.go has flawed assertion logic that uses `len(err.Error()) >= len(expectedMsg)` to check error messages, which doesn't actually validate message content and will pass for any error longer than the expected message.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
Repo: open-cluster-management-io/lab PR: 59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (3)
fleetconfig-controller/api/v1beta1/constants.go (7)
AddonsConfigured(22-22)InstanceTypeUnified(89-89)FCCAddOnName(116-116)ControllerNamespaceEnvVar(107-107)HubNamespaceEnvVar(104-104)SpokeNamespaceEnvVar(101-101)PurgeAgentNamespaceEnvVar(113-113)fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(52-58)fleetconfig-controller/api/v1beta1/spoke_types.go (3)
Spoke(319-333)HubRef(119-127)CleanupConfig(92-116)
fleetconfig-controller/internal/controller/v1beta1/addon.go (5)
fleetconfig-controller/api/v1beta1/constants.go (2)
ManagedBySelector(208-208)AnnotationAddOnDeploymentConfigHash(169-169)fleetconfig-controller/api/v1beta1/spoke_types.go (1)
AddOn(278-291)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (1)
SanitizeArgs(82-84)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
🔇 Additional comments (7)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (3)
368-383: LGTM - Guard logic correctly prevents panic.The fix using
fccIdx := -1initialization and early return when no FCC addon is found correctly addresses the critical issue flagged in the previous review. This ensures the function won't panic on empty or non-matchingspoke.Spec.AddOns.
432-435: Good use of deterministic sorting.Sorting the
CustomizedVariablesby name is essential to ensure the hash computation remains stable across reconciliation loops. Without this, identical configurations could produce different hashes due to map iteration order, causing unnecessary addon re-enables.
674-677: Clarify the AddonsConfigured condition state transition.After calling
handleSpokeAddonsto disable addons (line 668), the code setsAddonsConfiguredtoTrue(lines 674-677). Later, at lines 704-706, it's set toFalseafter waiting for cleanup. This state transition (True → False) after a disable operation seems counterintuitive.Is the intent to signal "addon configuration successfully applied" (even if that configuration is to disable them) before transitioning to "no addons configured"? If so, consider clarifying with a more descriptive message, or consolidate the condition updates to avoid the intermediate True state.
fleetconfig-controller/internal/controller/v1beta1/addon.go (4)
290-302: LGTM - Efficient approach to read current addon state.Building a map of ManagedClusterAddOn objects to access their config hash annotations is efficient and avoids N+1 API calls. The error handling gracefully defaults to an empty list, ensuring the reconciliation continues even if the List operation fails.
320-321: Verify: Hashing entire AddOn struct includes user annotations.The hash is computed on the entire
AddOnstruct, which includesConfigName,Annotations, andDeploymentConfig. This means changes to user-supplied annotations (not justDeploymentConfig) will trigger addon re-enable.If this is intentional (to detect any addon configuration change), document it. If only
DeploymentConfigchanges should trigger re-enable, consider hashing only that field:configHash, err := hash.ComputeHash(addon.DeploymentConfig)
353-363: LGTM - Error handling correctly preserves state on disable failure.The code now correctly returns
enabledAddonsunchanged whenhandleAddonDisablefails (lines 354-356), accurately reflecting that the cluster state hasn't changed. Only after successful disable does it update the list (lines 358-363). This addresses the issue flagged in the previous review.
459-488: LGTM - Clean implementation of config file generation.The function properly constructs an
AddOnDeploymentConfigresource with correct TypeMeta and ObjectMeta, marshals it to YAML, and writes to a temporary file with proper cleanup handling. The returned cleanup function follows the established pattern in the codebase.
|
/lgtm |
95e37c1
into
open-cluster-management-io:main
Spoke.SpecSpoke.Spec.AddOn.InstallNamespacefield, replaced byAddOnDeploymentConfigSpec.AgentInstallNamespace--config-fileflag toclusteradm addon enableSummary by CodeRabbit
New Features
Documentation
Chores
Tests
✏️ Tip: You can customize this high-level summary in your review settings.