Skip to content

Conversation

@arturshadnik
Copy link
Member

@arturshadnik arturshadnik commented Sep 25, 2025

Description

This PR introduces a new mode of operation "FleetConfig controller as an AddOn". In this mode, after a Spoke has been successfully joined to the Hub, a fleetconfig-controller-agent will be installed in the spoke cluster as an OCM addon. Once installed, the agent will take over managing the cluster for all upgrade and cleanup activities. This allows for decentralized management of Spokes, where the control plane is used only for configuration and addon management. The benefits of this mode are:

  • Reduced load on the hub cluster. After the join, the spoke's agent does most of the heavy lifting.
  • Improved security. The hub cluster no longer needs to maintain a long-lived spoke Kubeconfig secret. After the join, the secret can be safely deleted.
  • Improved support for mutli-clusters with limited network connectivity. Since the hub only needs to have direct network connectivity to the spoke during the initial join, this architecture is more tolerant to network partitions. If the agent cannot reach the hub to fetch updates, it will simply requeue until connectivity is restored.

AddOn mode is optional, and can be disabled by setting addonMode: false in the Helm values during install. When disabled, FleetConfig controller will operate exactly as before, with the hub's fleetconfig-controller-manager running in unified mode and handling the entire Spoke lifecycle.

Limitations

At this time, AddOn mode is not supported for EKS hubs. See open-cluster-management-io/ocm#1052

Testing

Dev testing

The following scenarios have been validated:

  • Single KinD cluster running as a hub-as-spoke (both modes)
  • 3 KinD clusters, 2 dedicated spokes. Each of the Spoke resources in a different namespace, Hub in a dedicated namespace. (both modes)
  • 2 KinD clusters, hub-as-spoke and spoke. Intermittently dropping all network traffic between the containers for varied lengths of time to simulate an airgap. (addon mode only)
  • 2 kubeadm clusters running on AWS IaaS. (addon mode only)
  • 2 GKE clusters. (addon mode only)

For each scenario, the passing criteria was:

  • Spokes reach "Running" status
  • Updates to spoke.spec.klusterlet.values result in a successful upgrade of Klusterlet
  • Spoke's Kubeconfig secret deleted with no interruption to operations (addon mode only)
  • Deleting a Spoke successfully uninstalls all addons, removes all OCM resources (including FCC agent when in addon mode) from the spoke cluster, and cleans up all resources related to the Spoke from the hub cluster

Automated testing

Additional assertions have been added to the E2E suite to ensure that the kubeconfig secret can be deleted, Klusterlet upgrades succeed, and no resources become orphaned

Summary by CodeRabbit

  • New Features

    • Hub/Spoke modes and instance types, addonMode, agent AddOn manifests and CRDs, Spoke.cleanupConfig, hub-as-spoke flow, dev/debug hub‑spoke scripts, ResourceWatcher, hub/spoke startup options.
  • Bug Fixes

    • Stronger addon validation and uniqueness checks; new status conditions (PivotComplete, KlusterletSynced, HubUpgradeFailed); improved join/upgrade/cleanup and addon binding sequencing.
  • Documentation

    • README, charts, smoketests updated; added Spoke reconciler walkthrough and Hub/Spoke e2e updates.
  • Chores

    • Helm/devspace/scripts, Docker build context, CRD install automation, deepcopy, arg sanitization, tests updated.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
@coderabbitai
Copy link

coderabbitai bot commented Sep 25, 2025

Walkthrough

Split runtime into Hub and Spoke modes; add Hub/Spoke APIs, Spoke CleanupConfig, new addon CRDs/manifests, manager factories, webhook/validation updates, hub‑spoke controllers and reconciliation flows, Helm/chart and devspace changes, tests, docs, and utility helpers (arg sanitization, ResourceWatcher).

Changes

Cohort / File(s) Summary
Manager & startup
fleetconfig-controller/cmd/main.go, fleetconfig-controller/cmd/manager/manager.go
Introduce exported manager.Options (flags + InstanceType), add ForHub/ForSpoke entrypoints, centralize manager/webhook/TLS/server setup and hub kubeconfig handling.
API types & deepcopy
fleetconfig-controller/api/v1beta1/constants.go, .../spoke_types.go, .../zz_generated.deepcopy.go
Add Hub/Spoke/addon/finalizer/env constants and supported lists; add CleanupConfig to SpokeSpec with helpers and deepcopy; remove Klusterlet.purgeOperator.
Controllers (hub & spoke)
internal/controller/v1beta1/{spoke_controller.go,spoke_handler.go,addon.go,hub_controller.go,constants.go,common.go}
Make reconciliation instance-type aware (manager/agent/unified), hub-centric finalizers/cleanup, join/accept/upgrade/unjoin flows, token/hubMeta plumbing, klusterlet value merging, FCC MCA patching/bindings, new SetupWithManagerForHub/ForSpoke and HubUpgradeFailed condition.
Webhook & validation
internal/webhook/v1beta1/{spoke_webhook.go,validation.go,validation_test.go,webhook_suite_test.go}
Inject instanceType into webhook/validator; convert addon validation to a validator method; add addon uniqueness/version checks, centralized error constants; update tests.
Addons CRDs & Helm addon manifests
charts/.../crds/0000_00_*, 0000_01_*, 0000_02_*, 0000_03_*, charts/.../templates/ocm/fcc-addon/*
Add CRDs: ClusterManagementAddOn, ManagedClusterAddOn, AddOnDeploymentConfig, AddOnTemplate; add AddOnTemplate/ClusterManagementAddOn Helm manifests and RBAC/ClusterRoleBinding templates for agent.
Helm charts & helpers
charts/fleetconfig-controller/templates/*, charts/.../_helpers.tpl, charts/.../values.yaml, charts/.../README.md
Add addonMode, provider/image helpers, guard webhook blocks, add CONTROLLER_NAMESPACE/CLUSTER_ROLE_NAME envs, add spoke cleanupConfig values, bump image tag and docs.
Chart templates & fleetconfig CRD
charts/.../templates/fleetconfig.yaml, crds/fleetconfig.open-cluster-management.io_spokes.yaml
Add cleanupConfig under Spoke.spec (purgeKlusterletOperator, purgeKubeconfigSecret, purgeAgentNamespace); remove Klusterlet.purgeOperator; update CRD schema/validation.
Devspace & dev scripts
devspace.yaml, devspace-start-hub.sh, devspace-start-spoke.sh, config/devspace/hub/manager.yaml, config/devspace/spoke/manager.yaml, hack/dev/spoke.yaml
Add hub/spoke dev profiles, HUB_PORT/SPOKE_PORT and ADDON_MODE vars, start/debug scripts and launch configs; expose addonMode to dev deployments; add hub/spoke manifests and sync excludes.
Build Dockerfiles
build/Dockerfile.{base,devspace,eks,gke}
Broaden builder COPY to include entire cmd/ directory instead of single cmd/main.go.
CRD install script & chart hooks
hack/install_crds.sh, charts/.../templates/ocm/fcc-addon/*, charts/.../templates/clusterissuer.yaml
Include addon CRDs in install script; add addon RBAC/ClusterRoleBinding and ClusterManagementAddOn manifest; add yaml-language-server directive in clusterissuer template.
E2E, tests & testdata
test/e2e/v1beta1_hub_spoke.go, test/e2e/helper.go, test/data/fleetconfig-values.yaml, internal/*_test.go
Convert e2e flows to Hub/Spoke, add agent addon to spokes, rename test secret, extend provisioning/teardown checks and expected conditions; update unit tests for new conditions and instance type.
Docs & guides
README.md, docs/2-phase-spoke-reconcile.md, docs/smoketests.md, charts/.../README.md
Replace FleetConfig references with Hub/Spoke, add Spoke reconciler walkthrough, update smoketests, quick-start, chart README and dev/debug instructions.
Utility & new package
internal/watch/watch.go, internal/args/{args.go,args_test.go}
Add generic ResourceWatcher (ConditionFunc/HandlerFunc) and arg sanitization helpers SanitizeArgs/SanitizeOutput plus Redacted constant and tests.
Misc & templates
charts/.../templates/deployment.yaml, deployment-dev.yaml, config/devspace/*
Add env vars CONTROLLER_NAMESPACE and CLUSTER_ROLE_NAME, webhook port/arg guards, addonMode templating, and minor template/dev manifest tweaks.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • ahmad-ibra
  • TylerGillson

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly conveys that the PR refactors the fleetconfig-controller to operate as an AddOn, matching the primary objective of introducing AddOn mode. It clearly references the key change without unnecessary detail, allowing reviewers to understand the main purpose at a glance. The conventional “refactor:” prefix and concise phrasing makes the intent clear in PR history.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9d3898e and 352b2b3.

📒 Files selected for processing (1)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (1)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (52-58)
fleetconfig-controller/api/v1beta1/spoke_types.go (6)
  • Spoke (309-323)
  • KlusterletChartConfig (214-216)
  • HubRef (110-118)
  • CleanupConfig (91-107)
  • AddOn (269-281)
  • Klusterlet (141-197)
fleetconfig-controller/api/v1beta1/constants.go (23)
  • InstanceTypeManager (83-83)
  • PivotComplete (31-31)
  • InstanceTypeUnified (89-89)
  • InstanceTypeAgent (86-86)
  • SupportedInstanceTypes (135-139)
  • KlusterletSynced (34-34)
  • SpokeJoined (28-28)
  • LabelManagedClusterType (151-151)
  • ManagedClusterTypeHubAsSpoke (77-77)
  • FCCAddOnName (116-116)
  • ControllerNamespaceEnvVar (107-107)
  • HubNamespaceEnvVar (104-104)
  • SpokeNamespaceEnvVar (101-101)
  • PurgeAgentNamespaceEnvVar (113-113)
  • AddonsConfigured (22-22)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • HubCleanupFinalizer (10-10)
  • CleanupFailed (25-25)
  • OCMSpokeNamespaces (142-146)
  • HubInitialized (19-19)
  • BundleVersionDefault (215-215)
  • BundleVersionLatest (212-212)
fleetconfig-controller/pkg/common/common.go (6)
  • ClusterClient (20-30)
  • AddOnClient (59-69)
  • GetManagedCluster (72-81)
  • UpdateManagedCluster (84-89)
  • WorkClient (46-56)
  • OperatorClient (33-43)
fleetconfig-controller/internal/kube/kube.go (3)
  • KubeconfigFromSecretOrCluster (113-119)
  • RestConfigFromKubeconfig (22-32)
  • RawFromInClusterRestConfig (75-81)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (4)
  • PrepareResources (44-67)
  • PrepareKubeconfig (27-41)
  • SanitizeArgs (82-84)
  • Redacted (16-16)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/version/version.go (2)
  • LowestBundleVersion (15-46)
  • Normalize (49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
@arturshadnik arturshadnik marked this pull request as ready for review September 25, 2025 21:30
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
fleetconfig-controller/build/Dockerfile.gke (2)

88-88: Fix: apk del --no-cache is invalid and will fail the build

apk del does not support --no-cache. This RUN will error on Alpine.

Apply this diff:

-RUN apk update && apk del --no-cache bash curl
+RUN apk update && apk del bash curl

67-78: Pin gcloud download URL to GCLOUD_VERSION and default OS

  • Add ARG OS=linux and include ${GCLOUD_VERSION} in the curl URL, e.g.:
    ARG OS=linux
    ARG GCLOUD_VERSION=525.0.0
    …
    && curl -Lo google-cloud-sdk.tar.gz "https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-${GCLOUD_VERSION}-${OS}-${GCLOUD_ARCH}.tar.gz"
  • Verify the correct ARCH token for arm64 tarballs (arm vs arm64) for this GCLOUD_VERSION
fleetconfig-controller/charts/fleetconfig-controller/crds/0000_01_addon.open-cluster-management.io_managedclusteraddons.crd.yaml (1)

1-406: Remove duplicate addon/v1alpha1 CRDs from chart
hack/install_crds.sh syncs all addon/v1alpha1/*.crd.yaml into fleetconfig-controller/charts/fleetconfig-controller/crds; delete 0000_01_addon.open-cluster-management.io_managedclusteraddons.crd.yaml (and its addon/v1alpha1 siblings) to avoid duplicate Helm installs.

fleetconfig-controller/README.md (1)

12-12: Fix spelling of “decentralized”.

The new sentence has a typo; please swap in the correct spelling so the onboarding docs read cleanly.

-      It is used to enable decentalized management of spoke clusters.
+      It is used to enable decentralized management of spoke clusters.
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)

412-431: Align HubRef update test with intended behavior.

The case is now labeled “allowed”, but wantErr is still true, so we continue to assert this change must be rejected. If HubRef updates are meant to be permitted, flip the expectation (and drop the error string); otherwise please revert the rename. Right now the test contradicts itself.

-			wantErr: true,
-			errMsg:  "spoke contains changes which are not allowed; only changes to spec.klusterlet.annotations, spec.klusterlet.values, spec.kubeconfig, spec.addOns, spec.timeout, and spec.logVerbosity are allowed when updating a spoke",
+			wantErr: false,
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)

363-366: Guard against missing HubRef to avoid invalid client.Get calls

If HubRef is empty (common in hub-as-spoke), cli.Get with empty name/namespace will error. Return a warning early.

Apply this diff:

  // try to get hub, if not present or not ready, log a warning that addons cant be properly validated
  hub := &v1beta1.Hub{}
-err := cli.Get(ctx, types.NamespacedName{Name: newObject.Spec.HubRef.Name, Namespace: newObject.Spec.HubRef.Namespace}, hub)
+if newObject.Spec.HubRef.Name == "" || newObject.Spec.HubRef.Namespace == "" {
+	return admission.Warnings{warnHubNotFound}, nil
+}
+err := cli.Get(ctx, types.NamespacedName{
+	Name:      newObject.Spec.HubRef.Name,
+	Namespace: newObject.Spec.HubRef.Namespace,
+}, hub)
fleetconfig-controller/internal/controller/v1beta1/addon.go (1)

674-701: Count only addon-owned ManifestWorks to avoid false positives and timeouts.

Comparing total count to 0/1 can race with unrelated works. Filter by OwnerReferences.Kind == "ManagedClusterAddOn" when deciding readiness. This reduces flakiness during cleanup.

Apply this diff:

-		// for hub-as-spoke, all addons must be removed.
-		// otherwise, fleetconfig-controller-manager must not be removed.
-		var expectedWorks = 1
-		if isHubAsSpoke {
-			expectedWorks = 0
-		}
-
-		if len(manifestWorks.Items) == expectedWorks {
+		// for hub-as-spoke, all addon works must be removed.
+		// otherwise, fleetconfig-controller-manager addon work must remain.
+		var expectedWorks = 1
+		if isHubAsSpoke {
+			expectedWorks = 0
+		}
+		// count only ManifestWorks owned by ManagedClusterAddOn
+		owned := 0
+		for _, mw := range manifestWorks.Items {
+			if slices.ContainsFunc(mw.OwnerReferences, func(or metav1.OwnerReference) bool { return or.Kind == managedClusterAddOn }) {
+				owned++
+			}
+		}
+		if owned == expectedWorks {
 			logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWorks", len(manifestWorks.Items))
 			return true, nil
 		}
 
-		logger.V(3).Info("waiting for addon manifestWorks cleanup",
-			"spokeName", spokeName,
-			"addonManifestWorks", len(manifestWorks.Items))
+		logger.V(3).Info("waiting for addon manifestWorks cleanup",
+			"spokeName", spokeName,
+			"addonManifestWorksOwnedByMCA", owned,
+			"allManifestWorks", len(manifestWorks.Items))

Based on learnings

fleetconfig-controller/test/e2e/helper.go (2)

472-475: Bug: wrong error guard after Create — AlreadyExists should be ignored, not NotFound.

Create returns AlreadyExists, so the current check turns a benign condition into a failure.

Apply:

- err = tc.kClient.Create(tc.ctx, cm)
- if err != nil && !kerrs.IsNotFound(err) {
+ err = tc.kClient.Create(tc.ctx, cm)
+ if err != nil && !kerrs.IsAlreadyExists(err) {
     utils.WarnError(err, "failed to create configmap")
     return err
   }

362-363: Fix WarnError formatting args.

An extra err argument is passed into the format string with only two %s verbs.

Use:

- utils.WarnError(err, "failed to fetch namespace %s from cluster '%s'", err, namespaceName, cluster)
+ utils.WarnError(err, "failed to fetch namespace %s from cluster '%s'", namespaceName, cluster)
🧹 Nitpick comments (19)
fleetconfig-controller/build/Dockerfile.eks (1)

41-49: Optional: add defaults for OS and verify arm64 artifact naming

You rely on ${OS}/${ARCH} for aws-iam-authenticator. Consider defaulting ARG OS=linux and confirm arm64 artifact name is linux_arm64 for v0.7.2.

fleetconfig-controller/hack/install_crds.sh (1)

29-29: LGTM: include OCM addon CRDs in sync

Pulling addon/v1alpha1 CRDs complements existing cluster CRDs. Consider surfacing a note in README about re-running this script when bumping OCM versions.

fleetconfig-controller/config/devspace/hub/manager.yaml (1)

19-26: Optional: add ROLE_NAME here too for parity

If the controller reads ROLE_NAME, mirror the Helm deployment by setting it here as well.

fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1)

165-166: Remove trailing blank line

Keeps linters happy.

Apply this diff:

-
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)

90-92: Align allowed-fields error message with actual allowances (mention spec.kubeconfig)

You already allow Kubeconfig changes but the error text omits it. Add for clarity.

Apply this diff:

-			return errors.New("only changes to spec.apiServer, spec.clusterManager.source.*, spec.hubAddOns, spec.addOnConfigs, spec.logVerbosity, spec.timeout, and spec.registrationAuth are allowed when updating the hub")
+			return errors.New("only changes to spec.apiServer, spec.clusterManager.source.*, spec.hubAddOns, spec.addOnConfigs, spec.logVerbosity, spec.timeout, spec.registrationAuth, and spec.kubeconfig are allowed when updating the hub")
fleetconfig-controller/config/devspace/spoke/manager.yaml (1)

18-56: Harden container securityContext for dev deployments

Set runAsNonRoot and disable privilege escalation to address CKV_K8S_20/23. Safe for dev; adjust if base image requires root.

Apply this diff:

       containers:
         - name: fleetconfig-controller-manager
+          securityContext:
+            allowPrivilegeEscalation: false
+            runAsNonRoot: true
+            capabilities:
+              drop: ["ALL"]
           env:
fleetconfig-controller/internal/controller/v1beta1/constants.go (1)

32-34: Clarify constant naming: value is a resource name, not a Kind

AddOnDeploymentConfigsKind holds the plural resource ("addondeploymentconfigs"), which can confuse readers expecting "AddOnDeploymentConfig". Consider a new, clearer constant and keep the old as a compat alias.

Apply this diff:

 	managedClusterAddOn        = "ManagedClusterAddOn"
-	AddOnDeploymentConfigsKind = "addondeploymentconfigs"
+	// Deprecated: prefer AddOnDeploymentConfigResource
+	AddOnDeploymentConfigsKind = "addondeploymentconfigs"
+	// AddOnDeploymentConfigResource is the plural resource name used in GVRs
+	AddOnDeploymentConfigResource = "addondeploymentconfigs"
fleetconfig-controller/devspace.yaml (1)

58-60: Make kubectl delete idempotent in debug pipelines

Add --ignore-not-found to avoid pipeline failures when the deployment isn’t present.

Apply this diff:

   debug: |-
     run_dependencies --all
     build_images fleetconfig-controller-dev
-    kubectl -n fleetconfig-system delete deployment fleetconfig-controller-manager
+    kubectl -n fleetconfig-system delete deployment fleetconfig-controller-manager --ignore-not-found
     create_deployments debug
     start_dev fleetconfig-controller-dev-hub
   debug-spoke: |-
     run_dependencies --all
     build_images fleetconfig-controller-dev
-    kubectl -n fleetconfig-system delete deployment fleetconfig-controller-manager
+    kubectl -n fleetconfig-system delete deployment fleetconfig-controller-manager --ignore-not-found
     create_deployments debug-spoke
     start_dev fleetconfig-controller-dev-spoke

Also applies to: 62-67

fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (3)

133-140: Harden the negative test by asserting the denial reason.

The test only expects failure. Assert a substring from the webhook/validation error (e.g., contains "forbidden" or FeatureGates immutable) to avoid false positives from transient errors.


200-235: Optional: factor cleanup checks and use IgnoreNotFound for brevity.

The repeated NotFound branches can be simplified with client.IgnoreNotFound and small helpers to reduce noise and ease future additions.


291-291: Remove noisy debug print.

Avoid printing raw status to stdout in tests. Use utils.Info with context if needed, or remove entirely.

Apply this diff:

-					fmt.Println(hubClone.Status)
+					// debug print removed to reduce noisy test output
fleetconfig-controller/charts/fleetconfig-controller/crds/0000_03_addon.open-cluster-management.io_addontemplates.crd.yaml (1)

82-83: Nit: fix typos in schema descriptions.

Examples:

  • "stratecy" → "strategy"
  • "elligible"/"afer"/"manfiestwork" → "eligible"/"after"/"manifestwork"
  • "resoure" → "resource"
  • "kuberenetes" → "kubernetes"

Purely cosmetic, but helps consumers and generated docs.

Also applies to: 119-123, 289-291, 395-396

fleetconfig-controller/cmd/main.go (1)

100-101: Fix misleading log for unknown cluster type.

The "controller=Spoke" field is incorrect when the type is unknown.

Apply this diff:

-		setupLog.Info("unable to create controller for unknown cluster type", "controller", "Spoke", "clusterType", mOpts.ClusterType, "allowed", apiv1beta1.SupportedClusterTypes)
+		setupLog.Info("unable to create controller for unknown cluster type", "clusterType", mOpts.ClusterType, "allowed", apiv1beta1.SupportedClusterTypes)
fleetconfig-controller/charts/fleetconfig-controller/crds/0000_00_addon.open-cluster-management.io_clustermanagementaddons.crd.yaml (1)

58-66: Nit: fix typos in schema descriptions.

Examples:

  • "filed" → "field"
  • "instaces" → "instances"

Keeps generated docs clean.

Also applies to: 66-67, 120-121

fleetconfig-controller/api/v1beta1/constants.go (2)

6-8: Fix typo in comment.

Use “that” instead of “than”.

-// HubCleanupPreflightFinalizer is the finalizer for cleanup preflight checks hub cluster's controller instance. Used to signal to the spoke's controller than unjoin can proceed.
+// HubCleanupPreflightFinalizer is the finalizer for cleanup preflight checks by the hub cluster's controller instance. Used to signal to the spoke's controller that unjoin can proceed.

74-107: Correct env-var comments and addon name description.

Minor doc fixes to avoid confusion.

-// HubNamespaceEnvVar is the environment variable containing the namespace of the Spoke resource.
+// HubNamespaceEnvVar is the environment variable containing the namespace of the Hub resource.
 
-// RoleNameEnvVar containing the name of the ClusterRole for fleetconfig-controller-manager.
+// RoleNameEnvVar is the environment variable containing the name of the ClusterRole for fleetconfig-controller-manager.
 
-// FCCAddOnName is the name of the fleetconfig-controller-addon
+// FCCAddOnName is the name of the fleetconfig-controller manager add-on
fleetconfig-controller/build/Dockerfile.devspace (1)

55-55: Broadened COPY is fine; add .dockerignore to keep image lean and caches stable.

Copying the entire cmd/ dir increases build context and can invalidate the build cache frequently. Exclude tests, docs, and other non-build assets via .dockerignore to reduce image size and churn.

fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go (1)

135-136: Make condition assertion resilient to future conditions.

assertSpokeConditions enforces exact length equality, making tests brittle as new conditions are added.

Consider ignoring length and only asserting expected types/statuses:

func assertSpokeConditions(conditions []v1beta1.Condition, expected map[string]metav1.ConditionStatus) error {
	seen := map[string]bool{}
	for _, c := range conditions {
		if want, ok := expected[c.Type]; ok {
			seen[c.Type] = true
			if c.Status != want {
				return fmt.Errorf("condition %s has status %s, expected %s", c.Type, c.Status, want)
			}
		}
	}
	for k := range expected {
		if !seen[k] {
			return fmt.Errorf("expected condition %s not found", k)
		}
	}
	return nil
}
fleetconfig-controller/api/v1beta1/spoke_types.go (1)

101-104: Clarify IsHubAsSpoke semantics; double‑check InCluster implication.

Classifying hub-as-spoke purely by InCluster may mislabel cases where a spoke uses in-cluster credentials but is not the hub cluster. Please confirm the intended contract and document it.

If applicable, tighten or clarify the comment:

-// IsHubAsSpoke returns true if the cluster is a hub-as-spoke. Determined either by name `hub-as-spoke` or an InCluster kubeconfig
+// IsHubAsSpoke returns true if this Spoke represents the hub cluster itself
+// (hub-as-spoke). Determined either by name "hub-as-spoke" or when the Spoke
+// is operated in-cluster by the same cluster as the controller process.
+// Note: If a Spoke runs in-cluster on a non-hub controller, do not set InCluster.
 func (s *Spoke) IsHubAsSpoke() bool {
 	return s.Name == ManagedClusterTypeHubAsSpoke || s.Spec.Kubeconfig.InCluster
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b2e21d and da104f7.

📒 Files selected for processing (34)
  • fleetconfig-controller/README.md (1 hunks)
  • fleetconfig-controller/api/v1beta1/constants.go (3 hunks)
  • fleetconfig-controller/api/v1beta1/spoke_types.go (1 hunks)
  • fleetconfig-controller/build/Dockerfile.base (1 hunks)
  • fleetconfig-controller/build/Dockerfile.devspace (1 hunks)
  • fleetconfig-controller/build/Dockerfile.eks (1 hunks)
  • fleetconfig-controller/build/Dockerfile.gke (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/crds/0000_00_addon.open-cluster-management.io_clustermanagementaddons.crd.yaml (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/crds/0000_01_addon.open-cluster-management.io_managedclusteraddons.crd.yaml (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/crds/0000_02_addon.open-cluster-management.io_addondeploymentconfigs.crd.yaml (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/crds/0000_03_addon.open-cluster-management.io_addontemplates.crd.yaml (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/clusterissuer.yaml (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml (2 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-management-addon.yaml (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-role-binding.yaml (1 hunks)
  • fleetconfig-controller/cmd/main.go (2 hunks)
  • fleetconfig-controller/cmd/manager/manager.go (1 hunks)
  • fleetconfig-controller/config/devspace/hub/manager.yaml (1 hunks)
  • fleetconfig-controller/config/devspace/spoke/manager.yaml (1 hunks)
  • fleetconfig-controller/devspace-start-spoke.sh (1 hunks)
  • fleetconfig-controller/devspace.yaml (7 hunks)
  • fleetconfig-controller/hack/install_crds.sh (1 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/addon.go (8 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/common.go (0 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/constants.go (1 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (8 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go (2 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go (2 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1 hunks)
  • fleetconfig-controller/test/data/fleetconfig-values.yaml (1 hunks)
  • fleetconfig-controller/test/e2e/helper.go (2 hunks)
  • fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (6 hunks)
💤 Files with no reviewable changes (1)
  • fleetconfig-controller/internal/controller/v1beta1/common.go
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.

Applied to files:

  • fleetconfig-controller/test/e2e/helper.go
  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/test/e2e/helper.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/devspace.yaml
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go
  • fleetconfig-controller/api/v1beta1/constants.go
  • fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go
  • fleetconfig-controller/README.md
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T18:42:03.404Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.

Applied to files:

  • fleetconfig-controller/config/devspace/spoke/manager.yaml
  • fleetconfig-controller/test/data/fleetconfig-values.yaml
  • fleetconfig-controller/devspace.yaml
  • fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go
  • fleetconfig-controller/README.md
📚 Learning: 2025-09-22T19:26:11.020Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/test/data/fleetconfig-v1alpha1.yaml:47-53
Timestamp: 2025-09-22T19:26:11.020Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller tests, the kubeconfigKey is intentionally set to "value" in test fixtures (fleetconfig-v1alpha1.yaml, fleetconfig-values.yaml) because that's how the test harness provisions the kubeconfig secret during test setup. This differs from the chart default of "kubeconfig" but is correct for the test environment.

Applied to files:

  • fleetconfig-controller/devspace.yaml
📚 Learning: 2025-08-27T21:58:32.141Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.

Applied to files:

  • fleetconfig-controller/devspace.yaml
  • fleetconfig-controller/config/devspace/hub/manager.yaml
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-08-22T17:55:52.159Z
Learnt from: TylerGillson
PR: open-cluster-management-io/lab#51
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-22T17:55:52.159Z
Learning: In the open-cluster-management-io/lab repository, chart versioning for fleetconfig-controller is handled automatically via GitHub release workflows, not through manual version bumps in Chart.yaml during regular PRs.

Applied to files:

  • fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml
🧬 Code graph analysis (9)
fleetconfig-controller/api/v1beta1/spoke_types.go (3)
fleetconfig-controller/api/v1alpha1/fleetconfig_types.go (2)
  • Spoke (350-392)
  • Kubeconfig (301-317)
fleetconfig-controller/api/v1beta1/constants.go (1)
  • ManagedClusterTypeHubAsSpoke (71-71)
fleetconfig-controller/api/v1beta1/common.go (1)
  • Kubeconfig (13-29)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)
  • AddOn (241-253)
fleetconfig-controller/cmd/main.go (2)
fleetconfig-controller/cmd/manager/manager.go (3)
  • Options (34-46)
  • ForHub (49-143)
  • ForSpoke (146-264)
fleetconfig-controller/api/v1beta1/constants.go (4)
  • SpokeDefaultMaxConcurrentReconciles (148-148)
  • ClusterTypeHub (76-76)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go (1)
fleetconfig-controller/api/v1beta1/constants.go (2)
  • ClusterTypeHub (76-76)
  • PivotComplete (31-31)
fleetconfig-controller/internal/controller/v1beta1/addon.go (4)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)
  • AddOn (241-253)
fleetconfig-controller/api/v1beta1/constants.go (1)
  • FCCAddOnName (103-103)
fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
  • GroupName (10-10)
  • Resource (26-28)
fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
  • AddOnDeploymentConfigsKind (33-33)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (2)
fleetconfig-controller/api/v1beta1/constants.go (8)
  • ClusterTypeHub (76-76)
  • HubCleanupFinalizer (10-10)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
  • PivotComplete (31-31)
  • SpokeNameEnvVar (88-88)
fleetconfig-controller/cmd/manager/manager.go (1)
  • Options (34-46)
fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (2)
fleetconfig-controller/test/utils/utils.go (3)
  • DevspaceRunPipeline (78-97)
  • Info (66-69)
  • WarnError (72-75)
fleetconfig-controller/api/v1beta1/constants.go (1)
  • Deleting (57-57)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (46-52)
fleetconfig-controller/api/v1beta1/spoke_types.go (5)
  • Spoke (281-295)
  • KlusterletChartConfig (186-188)
  • HubRef (86-94)
  • AddOn (241-253)
  • Klusterlet (107-169)
fleetconfig-controller/api/v1beta1/constants.go (19)
  • ClusterTypeHub (76-76)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
  • SpokeJoined (28-28)
  • LabelManagedClusterType (118-118)
  • ManagedClusterTypeHubAsSpoke (71-71)
  • ControllerNamespaceEnvVar (97-97)
  • HubNamespaceEnvVar (94-94)
  • SpokeNamespaceEnvVar (91-91)
  • AddonsConfigured (22-22)
  • RoleNameEnvVar (100-100)
  • DefaultFCCManagerRole (106-106)
  • FCCAddOnName (103-103)
  • PivotComplete (31-31)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • HubCleanupFinalizer (10-10)
  • HubInitialized (19-19)
  • AWSIRSARegistrationDriver (130-130)
fleetconfig-controller/pkg/common/common.go (6)
  • ClusterClient (20-30)
  • AddOnClient (59-69)
  • GetManagedCluster (72-81)
  • UpdateManagedCluster (84-89)
  • WorkClient (46-56)
  • OperatorClient (33-43)
fleetconfig-controller/internal/kube/kube.go (3)
  • KubeconfigFromSecretOrCluster (113-119)
  • RawFromInClusterRestConfig (75-81)
  • RestConfigFromKubeconfig (22-32)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (2)
  • PrepareResources (32-55)
  • PrepareKubeconfig (15-29)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/version/version.go (2)
  • LowestBundleVersion (15-46)
  • Normalize (49-55)
fleetconfig-controller/cmd/manager/manager.go (8)
fleetconfig-controller/internal/controller/v1alpha1/fleetconfig_controller.go (1)
  • FleetConfigReconciler (58-62)
fleetconfig-controller/internal/controller/v1beta1/hub_controller.go (1)
  • HubReconciler (55-59)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (46-52)
fleetconfig-controller/api/v1beta1/constants.go (5)
  • ClusterTypeSpoke (79-79)
  • SpokeNamespaceEnvVar (91-91)
  • HubNamespaceEnvVar (94-94)
  • HubKubeconfigEnvVar (82-82)
  • DefaultHubKubeconfigPath (85-85)
fleetconfig-controller/api/v1alpha1/fleetconfig_webhook.go (1)
  • SetupFleetConfigWebhookWithManager (38-43)
fleetconfig-controller/internal/webhook/v1beta1/hub_webhook.go (1)
  • SetupHubWebhookWithManager (43-55)
fleetconfig-controller/internal/webhook/v1beta1/spoke_webhook.go (1)
  • SetupSpokeWebhookWithManager (44-56)
fleetconfig-controller/internal/kube/kube.go (1)
  • RestConfigFromKubeconfig (22-32)
🪛 Checkov (3.2.334)
fleetconfig-controller/config/devspace/spoke/manager.yaml

[medium] 1-56: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 1-56: Minimize the admission of root containers

(CKV_K8S_23)

🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml

[warning] 165-165: too many blank lines (1 > 0)

(empty-lines)


[error] 19-19: syntax error: expected , but found ''

(syntax)

fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-role-binding.yaml

[error] 9-9: syntax error: expected , but found ''

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (32)
fleetconfig-controller/build/Dockerfile.gke (2)

80-81: Confirm gcloud’s bash dependency post-install

You remove bash from the final image. Some gcloud wrapper scripts are bash-based. Since you only need the gke-gcloud-auth-plugin, this may be fine, but please confirm no code invokes gcloud at runtime.


23-30: LGTM on copying full cmd directory

Broader COPY context is appropriate given the main refactor; the build target remains cmd/main.go.

fleetconfig-controller/build/Dockerfile.eks (1)

24-31: LGTM: broadened COPY context is consistent with the refactor

Copying the entire cmd/ dir aligns with multi-entrypoint patterns and keeps the build target unchanged.

fleetconfig-controller/charts/fleetconfig-controller/templates/clusterissuer.yaml (1)

2-2: LGTM: non-functional YAML language server hint

The added comment improves editor UX and doesn’t affect rendered manifests.

fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml (2)

47-53: Verify ROLE_NAME aligns with rendered RBAC object names

Ensure a Role/ClusterRole named {{ include "chart.fullname" . }}-manager-role exists; otherwise this env var is misleading or unused.


71-81: LGTM: gating TLS mounts/volumes on webhooks

Mounts and secret volumes are correctly included only when webhooks are enabled.

fleetconfig-controller/test/data/fleetconfig-values.yaml (1)

41-43: LGTM: add the fleetconfig-controller-manager addon

Matches the new addon wiring introduced in the PR. Ensure AddOnTemplate/ClusterManagementAddOn resources use the same name for consistency.

fleetconfig-controller/config/devspace/hub/manager.yaml (1)

26-30: LGTM: add CONTROLLER_NAMESPACE for parity with chart

Keeps devspace config aligned with the Helm deployment.

fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-role-binding.yaml (1)

11-13: Verify addon-manager service account exists in the hub namespace

ClusterRoleBinding targets ServiceAccount addon-manager-controller-sa in open-cluster-management-hub. Ensure it is created by OCM on the hub or provision it in this chart.

If needed, I can add a ServiceAccount manifest to this chart or reference the upstream one. Do you want me to draft it?

fleetconfig-controller/charts/fleetconfig-controller/crds/0000_02_addon.open-cluster-management.io_addondeploymentconfigs.crd.yaml (1)

1-280: LGTM: CRD schema looks consistent with OCM v1alpha1 AddOnDeploymentConfig

Fields, list-types, and patterns align with expectations.

fleetconfig-controller/devspace.yaml (1)

163-188: Nice split between hub and spoke dev targets

Clear terminal commands and selective syncs reduce churn and speed up inner loop.

fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (2)

27-27: LGTM: apps/v1 import used correctly for addon agent checks.


64-69: LGTM: devspace pipelines sequence reads well (load to spoke, deploy on hub).

fleetconfig-controller/api/v1beta1/constants.go (3)

12-13: LGTM: clarified Spoke finalizer comment.


27-31: LGTM: new PivotComplete condition.


109-114: LGTM: SupportedClusterTypes exported list.

fleetconfig-controller/cmd/manager/manager.go (2)

61-65: LGTM: HTTP/2 hardening via TLS NextProtos.


198-229: Confirm cache scoping matches RBAC of the add-on.

The ByObject cache scopes only Spoke (spokeNamespace), ConfigMap (spokeNamespace), and Hub (hubNamespace). Ensure the add-on’s permissions match these watches; otherwise reconciles will stall.

If helpful, I can scan the Helm templates/ClusterRole for matching list/watch verbs in those namespaces.

fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (2)

94-107: LGTM: finalizer choreography per cluster type.

Hub adds preflight + hub cleanup (and spoke cleanup for hub-as-spoke); Spoke adds spoke cleanup.

Also applies to: 109-116


151-153: LGTM: initialize PivotComplete with WantStatus=true.

fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (8)

62-87: Nil klusterlet values are intentionally supported.

Downstream paths handle nil (values file omitted; hash uses ZeroNil). No changes needed.

Based on learnings


182-211: Ensure controller namespace is set for AddOnDeploymentConfig.

If CONTROLLER_NAMESPACE is empty, the agent install namespace becomes empty. Consider validating and surfacing a clear error.

I can add a guard here similar to the manager env var checks if you want.


316-344: LGTM: safety checks before unjoin (manifestWorks and add-ons).


387-399: LGTM: CSR cleanup filtered by cluster-name label and suffix trimming.


446-508: LGTM: spoke-side unjoin and self-cleanup, with NotFound-safe deletes.


949-955: Confirm mergo semantics match intent.

With mergo.WithOverride, only non-empty values from specInterface override fromInterface, which aligns with “ignore zero-values from spec”.

If the intent is to also allow empty values to clear defaults, switch to WithOverwriteWithEmptyValue (if available in the used mergo version).


972-991: LGTM: values file emission is conditional and cleaned up.


855-857: Nil check on HubSpec.ClusterManager is valid
HubSpec.ClusterManager is defined as a pointer in v1beta1 ( *ClusterManager ); the nil check compiles and is required. No change needed.

Likely an incorrect or invalid review comment.

fleetconfig-controller/build/Dockerfile.base (1)

25-25: LGTM: copying cmd/ aligns with the build target.

No functional concerns; the build still targets cmd/main.go.

fleetconfig-controller/test/e2e/helper.go (2)

490-491: Confirm PivotComplete applies to all spokes.

You require PivotComplete=True for every spoke. Verify non hub-as-spoke clusters also set this condition; otherwise gate this expectation per spoke type.


57-58: No change needed: v1beta1fccAddOnAgentNN is used in fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (line 202).

Likely an incorrect or invalid review comment.

fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go (1)

74-78: LGTM: initializes ClusterType=hub for the hub-as-spoke path.

This matches the new reconciler field and test intent.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml (1)

60-68: Keep the ports: key when webhooks are disabled

When admissionWebhooks.enabled is false, the template renders - containerPort: … without a preceding ports: key, producing invalid YAML. Always emit the list key and add the webhook port conditionally.

-      {{- if .Values.admissionWebhooks.enabled }}
-        ports:
-        - containerPort: {{ .Values.webhookService.port }}
-          name: webhook-server
-          protocol: TCP
-        {{- end }}
+        ports:
+        {{- if .Values.admissionWebhooks.enabled }}
+        - containerPort: {{ .Values.webhookService.port }}
+          name: webhook-server
+          protocol: TCP
+        {{- end }}
         - containerPort: {{ .Values.healthCheck.port }}
           name: healthz
           protocol: TCP
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1)

164-165: Remove trailing blank line.

YAMLlint warns about extra blank line at EOF.

Apply this diff:

-              clusterRoleName: {{ include "chart.fullname" . }}-manager-role
-
+              clusterRoleName: {{ printf "%s-manager-role" (include "chart.fullname" .) | quote }}

As per static analysis hints

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between da104f7 and 282d2f6.

📒 Files selected for processing (9)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml (2 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-management-addon.yaml (1 hunks)
  • fleetconfig-controller/cmd/main.go (2 hunks)
  • fleetconfig-controller/cmd/manager/manager.go (1 hunks)
  • fleetconfig-controller/devspace-start-spoke.sh (1 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/addon.go (8 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (8 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-management-addon.yaml
  • fleetconfig-controller/internal/controller/v1beta1/addon.go
  • fleetconfig-controller/devspace-start-spoke.sh
  • fleetconfig-controller/cmd/main.go
🧰 Additional context used
🧠 Learnings (6)
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)
  • AddOn (241-253)
fleetconfig-controller/api/v1beta1/constants.go (1)
  • FCCAddOnName (103-103)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (2)
fleetconfig-controller/api/v1beta1/constants.go (8)
  • ClusterTypeHub (76-76)
  • HubCleanupFinalizer (10-10)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
  • PivotComplete (31-31)
  • SpokeNameEnvVar (88-88)
fleetconfig-controller/cmd/manager/manager.go (1)
  • Options (33-45)
🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml

[warning] 164-164: too many blank lines (1 > 0)

(empty-lines)


[error] 19-19: syntax error: expected , but found ''

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (6)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)

249-257: Guard empty CLUSTER_NAME before scoping the controller

If CLUSTER_NAME is unset, the predicate matches nothing and the controller never reconciles. Validate it up front so startup fails with a clear error instead of silently disabling the controller.

-func (r *SpokeReconciler) SetupWithManagerForSpoke(mgr ctrl.Manager) error {
-	spokeName := os.Getenv(v1beta1.SpokeNameEnvVar) // we know this is set, because the mgr setup would have failed otherwise
+func (r *SpokeReconciler) SetupWithManagerForSpoke(mgr ctrl.Manager) error {
+	spokeName := strings.TrimSpace(os.Getenv(v1beta1.SpokeNameEnvVar))
+	if spokeName == "" {
+		return fmt.Errorf("env %s must be set to the Spoke name", v1beta1.SpokeNameEnvVar)
+	}
 	return ctrl.NewControllerManagedBy(mgr).
 		For(&v1beta1.Spoke{},
 			builder.WithPredicates(predicate.NewPredicateFuncs(

(Add strings to the imports.)

fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (5)

84-88: Good fix: removed namespace from ClusterRole metadata.

Cluster-scoped resource no longer sets metadata.namespace.


149-153: Good fix: removed invalid namespace from RoleBinding.roleRef.

roleRef now contains only apiGroup, kind, name.


19-19: Quote Helm-composed names to fix YAML parse error (YAMLlint).

Unquoted concatenations trigger a parse error at Line 19 and risk elsewhere. Quote via printf + quote.

Apply this diff:

-          name: {{ include "chart.fullname" . }}-manager
+          name: {{ printf "%s-manager" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-manager-spoke-rolebinding
+          name: {{ printf "%s-manager-spoke-rolebinding" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-manager-spoke-role
+          name: {{ printf "%s-manager-spoke-role" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-leader-election-role
+          name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-leader-election-rolebinding
+          name: {{ printf "%s-leader-election-rolebinding" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-leader-election-role
+          name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }}
@@
-              clusterRoleName: {{ include "chart.fullname" . }}-manager-role
+              clusterRoleName: {{ printf "%s-manager-role" (include "chart.fullname" .) | quote }}

As per static analysis hints

Also applies to: 116-116, 120-120, 128-128, 145-145, 152-152, 164-164


163-164: No missing ClusterRole: The templated ClusterRole {{ include "chart.fullname" . }}-manager-role is defined in charts/fleetconfig-controller/templates/rbac/manager-rbac.yaml.


89-92: Confirm necessity of namespace delete permission. Namespace deletion is required by cleanup logic in internal/controller/v1alpha1/hub.go and internal/controller/v1beta1/spoke_handler.go and is gated by the CR’s purge flags—permission should remain.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)

412-431: Fix test name vs expectation mismatch for HubRef change.

The case is named “allowed - HubRef change” but asserts wantErr=true with a disallowed-fields message. Either:

  • Rename to “disallowed - HubRef change”, or
  • Allow HubRef changes in allowSpokeUpdate and set wantErr=false.

Given current code, rename is the minimal fix.

-        name: "allowed - HubRef change",
+        name: "disallowed - HubRef change",
...
-        wantErr: true,
-        errMsg:  "spoke contains changes which are not allowed; only changes to spec.klusterlet.annotations, spec.klusterlet.values, spec.kubeconfig, spec.addOns, spec.timeout, and spec.logVerbosity are allowed when updating a spoke",
+        wantErr: true,
+        errMsg:  "spoke contains changes which are not allowed; only changes to spec.klusterlet.annotations, spec.klusterlet.values, spec.kubeconfig, spec.addOns, spec.timeout, and spec.logVerbosity are allowed when updating a spoke",
🧹 Nitpick comments (5)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)

362-376: Optionally short-circuit on existing hard errors to save API calls.

If errs already contains Invalids from addon presence checks, you can return early instead of calling the API for hub lookup.

 	errs := field.ErrorList{}
 
 	if newObject.IsHubAsSpoke() {
 		...
 	}
 
+	if len(errs) > 0 {
+		return nil, errs
+	}
+
 	// try to get hub, if not present or not ready, log a warning that addons cant be properly validated
fleetconfig-controller/devspace.yaml (2)

163-188: Minor: consider numeric port vars for dev terminal mappings.

DevSpace treats ports as numbers; defining HUB_PORT as an int avoids accidental quoting issues across tooling.

-  HUB_PORT:
-    value: "2344"
+  HUB_PORT:
+    value: 2344

Repeat similarly for SPOKE_PORT.


195-198: Same nit: SPOKE_PORT as number.

See prior suggestion.

-  SPOKE_PORT:
-    value: "2345"
+  SPOKE_PORT:
+    value: 2345
fleetconfig-controller/internal/controller/v1beta1/addon.go (2)

387-396: Handle MCA not-yet-created and ensure idempotent patching.

Right after enable, MCA may not exist; current code returns an error. Add a short poll for NotFound and avoid duplicating the config.

-        if a.ConfigName == v1beta1.FCCAddOnName {
-            err = patchFCCMca(ctx, spoke.Name, addonC)
-            if err != nil {
-                enableErrs = append(enableErrs, err)
-                continue
-            }
-        }
+        if a.ConfigName == v1beta1.FCCAddOnName {
+            if err := patchFCCMca(ctx, spoke.Name, addonC); err != nil {
+                enableErrs = append(enableErrs, err)
+                continue
+            }
+        }

And update patchFCCMca as below (see next comment).


676-699: Verify expected ManifestWorks count logic during cleanup.

Switching to expectedWorks=1 for non hub-as-spoke assumes FCC manager MW remains during unjoin. Previous flow (per learnings) treated zero total MW as equivalent to zero addon MW in this context. Confirm this invariant change is intended; otherwise we may time out or under-clean.

Consider filtering to addon-owned ManifestWorks (e.g., by ownerRef kind ManagedClusterAddOn) or making expected count an argument derived from the caller’s intent.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 282d2f6 and d863fed.

📒 Files selected for processing (11)
  • fleetconfig-controller/api/v1beta1/constants.go (3 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml (1 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1 hunks)
  • fleetconfig-controller/cmd/main.go (2 hunks)
  • fleetconfig-controller/cmd/manager/manager.go (1 hunks)
  • fleetconfig-controller/config/devspace/hub/manager.yaml (1 hunks)
  • fleetconfig-controller/devspace.yaml (7 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/addon.go (8 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/constants.go (1 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go (3 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml
  • fleetconfig-controller/cmd/manager/manager.go
🧰 Additional context used
🧠 Learnings (8)
📚 Learning: 2025-08-27T21:58:32.141Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.

Applied to files:

  • fleetconfig-controller/config/devspace/hub/manager.yaml
  • fleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/api/v1beta1/constants.go
  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/api/v1beta1/constants.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go
📚 Learning: 2025-09-22T18:42:03.404Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.

Applied to files:

  • fleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-22T19:26:11.020Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/test/data/fleetconfig-v1alpha1.yaml:47-53
Timestamp: 2025-09-22T19:26:11.020Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller tests, the kubeconfigKey is intentionally set to "value" in test fixtures (fleetconfig-v1alpha1.yaml, fleetconfig-values.yaml) because that's how the test harness provisions the kubeconfig secret during test setup. This differs from the chart default of "kubeconfig" but is correct for the test environment.

Applied to files:

  • fleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-25T23:31:11.561Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml:110-112
Timestamp: 2025-09-25T23:31:11.561Z
Learning: The fleetconfig-controller-manager spoke agent requires create/update/patch/delete permissions on CustomResourceDefinitions because `clusteradm upgrade klusterlet` operations need create/update permissions and cleanup operations require delete permissions for proper lifecycle management.

Applied to files:

  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
🧬 Code graph analysis (3)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (3)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
  • AddOn (241-253)
  • HubRef (86-94)
fleetconfig-controller/api/v1alpha1/fleetconfig_types.go (2)
  • AddOn (430-442)
  • Hub (176-208)
fleetconfig-controller/api/v1beta1/constants.go (2)
  • FCCAddOnName (103-103)
  • HubInitialized (19-19)
fleetconfig-controller/cmd/main.go (2)
fleetconfig-controller/cmd/manager/manager.go (3)
  • Options (33-45)
  • ForHub (73-146)
  • ForSpoke (149-257)
fleetconfig-controller/api/v1beta1/constants.go (4)
  • SpokeDefaultMaxConcurrentReconciles (148-148)
  • ClusterTypeHub (76-76)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
fleetconfig-controller/internal/controller/v1beta1/addon.go (3)
fleetconfig-controller/api/v1beta1/constants.go (1)
  • FCCAddOnName (103-103)
fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
  • GroupName (10-10)
  • Resource (26-28)
fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
  • AddOnDeploymentConfigResource (33-33)
🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml

[error] 19-19: syntax error: expected , but found ''

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (21)
fleetconfig-controller/config/devspace/hub/manager.yaml (1)

26-32: Env wiring looks good; matches new constants.

CONTROLLER_NAMESPACE from fieldRef and static ROLE_NAME align with v1beta1 constants. No issues.

fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (3)

221-223: Updated hub allowed-fields message is consistent with implementation.

Includes spec.kubeconfig; matches validation.go.


236-238: Updated hub allowed-fields message is consistent with implementation.

Includes spec.kubeconfig; matches validation.go.


255-257: Updated hub allowed-fields message is consistent with implementation.

Includes spec.kubeconfig; matches validation.go.

fleetconfig-controller/internal/webhook/v1beta1/validation.go (4)

24-26: Good: removed local FCC addon name; using API constant.

Avoids string drift.


89-91: Allowed-fields error updated to include kubeconfig.

Matches tests.


348-360: Add-on presence guard based on IsHubAsSpoke is clear and correct.

Checks both allow/deny cases using the shared constant.


378-382: Good: internal errors during hub/CMA listing are returned, not downgraded to warnings.

Prevents invalid resources from slipping through.

fleetconfig-controller/devspace.yaml (7)

12-15: Ports vars added (HUB_PORT/SPOKE_PORT).

Looks fine.


42-43: Start hub dev target in default dev pipeline.

Good separation vs spoke.


58-61: Debug pipeline starts hub dev target.

Consistent with manifest path switch to hub.


62-67: Spoke debug path added.

Assumes hub exists; that’s fine for targeted spoke debugging.

Please confirm this pipeline is intended to run with a pre-existing hub (or via a separate terminal running the hub).


114-116: Wires fleetConfig.enabled flag for dev.

Aligns with profiles; no issues.


137-139: Debug manifests switched to hub.

Matches pipeline intent.


141-144: Spoke debug manifests path correct.

No issues.

fleetconfig-controller/cmd/main.go (2)

66-69: Sane webhook defaults set (cert dir and port).

Prevents ephemeral port issues; good.


86-102: Cluster-type switch delegating to ForHub/ForSpoke looks solid.

Clear failure for unknown types and proper exit behavior.

fleetconfig-controller/api/v1beta1/constants.go (2)

74-107: New cluster-type and env constants look good.

Names/readability align across code and charts.


109-114: SupportedClusterTypes matches usage in main.go.

Good centralization.

fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1)

19-19: Quote concatenated Helm names to fix YAML parse error and ensure correctness.

Unquoted concatenations like name: {{ include "chart.fullname" . }}-manager cause YAMLlint syntax errors and can break templating. Quote them consistently (also avoids accidental number parsing).

-          name: {{ include "chart.fullname" . }}-manager
+          name: {{ printf "%s-manager" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-manager-spoke-rolebinding
+          name: {{ printf "%s-manager-spoke-rolebinding" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-manager-spoke-role
+          name: {{ printf "%s-manager-spoke-role" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-leader-election-role
+          name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-leader-election-rolebinding
+          name: {{ printf "%s-leader-election-rolebinding" (include "chart.fullname" .) | quote }}
@@
-              clusterRoleName: {{ include "chart.fullname" . }}-manager-role
+              clusterRoleName: {{ printf "%s-manager-role" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-leader-election-role
+          name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }}

Also applies to: 116-116, 120-121, 128-129, 145-146, 156-157, 163-164

fleetconfig-controller/internal/controller/v1beta1/addon.go (1)

408-448: Bug: wrong field access in AddOnConfig comparisons; and no tolerance for MCA creation delay.

Comparing c.Group/Resource/Name/Namespace is invalid; fields are nested. Also, returning on Get error (e.g., NotFound) races with addon enable.

Apply this diff to fix field access and add a brief poll for MCA readiness plus idempotent config check:

-func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error {
-    mca, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{})
-    if err != nil {
-        return fmt.Errorf("failed to configure fleetconfig-controller-manager: %v", err)
-    }
+func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error {
+    // wait briefly for MCA to appear after enable
+    var mca *addonv1alpha1.ManagedClusterAddOn
+    if err := wait.PollUntilContextTimeout(ctx, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) {
+        obj, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{})
+        if err != nil {
+            if kerrs.IsNotFound(err) {
+                return false, nil
+            }
+            return false, err
+        }
+        mca = obj
+        return true, nil
+    }); err != nil {
+        return fmt.Errorf("failed to locate ManagedClusterAddOn %q in %q: %v", v1beta1.FCCAddOnName, spokeName, err)
+    }
 
     desired := addonv1alpha1.AddOnConfig{
         ConfigGroupResource: addonv1alpha1.ConfigGroupResource{
             Group:    addonv1alpha1.GroupName,
             Resource: AddOnDeploymentConfigResource,
         },
         ConfigReferent: addonv1alpha1.ConfigReferent{
             Name:      v1beta1.FCCAddOnName,
             Namespace: spokeName,
         },
     }
-    if slices.ContainsFunc(mca.Spec.Configs, func(c addonv1alpha1.AddOnConfig) bool {
-        return c.Group == desired.Group &&
-            c.Resource == desired.Resource &&
-            c.Name == desired.Name &&
-            c.Namespace == desired.Namespace
-    }) {
+    if slices.ContainsFunc(mca.Spec.Configs, func(c addonv1alpha1.AddOnConfig) bool {
+        return c.ConfigGroupResource.Group == desired.ConfigGroupResource.Group &&
+            c.ConfigGroupResource.Resource == desired.ConfigGroupResource.Resource &&
+            c.ConfigReferent.Name == desired.ConfigReferent.Name &&
+            c.ConfigReferent.Namespace == desired.ConfigReferent.Namespace
+    }) {
         return nil
     }
     mca.Spec.Configs = append(mca.Spec.Configs, desired)
     patchBytes, err := json.Marshal(map[string]any{
         "spec": map[string]any{"configs": mca.Spec.Configs},
     })
     if err != nil {
         return fmt.Errorf("failed to marshal patch for fleetconfig-controller-manager: %v", err)
     }
     if _, err = addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Patch(
         ctx,
         v1beta1.FCCAddOnName,
         types.MergePatchType,
         patchBytes,
         metav1.PatchOptions{},
     ); err != nil {
         return fmt.Errorf("failed to patch fleetconfig-controller-manager: %v", err)
     }
     return nil
 }

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)

366-379: Return early if prior validation failed (avoid extra hub reads).

If errs already contains hard failures, skip the hub lookup and return immediately.

 	if newObject.IsHubAsSpoke() {
 		...
 	} else {
 		...
 	}
 
+	if len(errs) > 0 {
+		return nil, errs
+	}
+
 	// try to get hub, if not present or not ready, log a warning that addons cant be properly validated

416-425: Typo: isAddondEnabled -> isAddonEnabled (optional).

Rename for clarity; update call site accordingly.

fleetconfig-controller/api/v1beta1/constants.go (2)

99-101: Doc nit: clarify RoleNameEnvVar comment.

Small grammar fix.

-	// RoleNameEnvVar containing the name of the ClusterRole for fleetconfig-controller-manager.
+	// RoleNameEnvVar is the environment variable containing the name of the ClusterRole for fleetconfig-controller-manager.

155-161: Doc nits: fix typos in labels comments.

Minor grammar/typo corrections.

-	// ManagedByLabels are labeles applies to resources to denote that fleetconfig-controller is managing the lifecycle.
+	// ManagedByLabels are labels applied to resources to denote that fleetconfig-controller is managing the lifecycle.
@@
-	// ManagedBySelector is a label selector for filtering add-on resources managed fleetconfig-controller.
+	// ManagedBySelector is a label selector for filtering add-on resources managed by fleetconfig-controller.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d863fed and aaedc87.

📒 Files selected for processing (5)
  • fleetconfig-controller/api/v1beta1/constants.go (4 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/addon.go (10 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/constants.go (1 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go (4 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (4 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/api/v1beta1/constants.go
  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
🧬 Code graph analysis (3)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
  • SupportedHubAddons (168-171)
  • FCCAddOnName (103-103)
  • HubInitialized (19-19)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
  • AddOn (241-253)
  • HubRef (86-94)
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)
fleetconfig-controller/api/v1beta1/hub_types.go (4)
  • Hub (243-257)
  • HubSpec (26-79)
  • AddOnConfig (158-181)
  • HubAddOn (184-198)
fleetconfig-controller/internal/controller/v1beta1/addon.go (3)
fleetconfig-controller/api/v1beta1/constants.go (4)
  • SupportedHubAddons (168-171)
  • LabelAddOnManagedBy (121-121)
  • FCCAddOnName (103-103)
  • AddonArgoCD (164-164)
fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
  • GroupName (10-10)
  • Resource (26-28)
fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
  • AddOnDeploymentConfigResource (30-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
🔇 Additional comments (7)
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)

222-257: LGTM: error message updated to include kubeconfig.

The updated allowed-fields message for Hub updates matches the validation logic.

fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)

152-158: LGTM: enforce SupportedHubAddons.

Validates HubAddOn names against the centralized list.

fleetconfig-controller/internal/controller/v1beta1/addon.go (3)

387-396: LGTM: FCC MCA patch gated and aggregated.

Patching only when FCC addon is enabled and aggregating errors is sound.


690-699: Avoid brittle equality on total ManifestWorks; detect addon-owned works instead.

Comparing len(manifestWorks) to 0/1 risks timeouts if unrelated works exist or counts fluctuate. Detect presence of addon-owned ManifestWorks via OwnerReferences and complete when none remain.

-		// for hub-as-spoke, all addons must be removed.
-		// otherwise, fleetconfig-controller-manager must not be removed.
-		var expectedWorks = 1
-		if isHubAsSpoke {
-			expectedWorks = 0
-		}
-
-		if len(manifestWorks.Items) == expectedWorks {
+		// Complete when no ManifestWorks remain that are owned by ManagedClusterAddOn
+		hasAddonWorks := slices.ContainsFunc(manifestWorks.Items, func(mw workv1.ManifestWork) bool {
+			return slices.ContainsFunc(mw.OwnerReferences, func(or metav1.OwnerReference) bool {
+				return or.Kind == managedClusterAddOn
+			})
+		})
+		if !hasAddonWorks {
 			logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWorks", len(manifestWorks.Items))
 			return true, nil
 		}

This matches the documented cleanup flow where non-addon works are handled earlier. Based on learnings

⛔ Skipped due to learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

408-448: Fix field access and make patch idempotent with eventual-consistency wait.

  • c.Group/Resource/Name/Namespace don’t exist on AddOnConfig; use nested fields.
  • GET may race right after enable; poll for creation and avoid duplicate configs.
-func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error {
-	mca, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{})
-	if err != nil {
-		return fmt.Errorf("failed to configure fleetconfig-controller-manager: %v", err)
-	}
+func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error {
+	// Wait briefly for MCA to appear after enable
+	var mca *addonv1alpha1.ManagedClusterAddOn
+	if err := wait.PollUntilContextTimeout(ctx, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) {
+		obj, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{})
+		if err != nil {
+			if kerrs.IsNotFound(err) {
+				return false, nil
+			}
+			return false, err
+		}
+		mca = obj
+		return true, nil
+	}); err != nil {
+		return fmt.Errorf("failed to locate ManagedClusterAddOn %q in %q: %v", v1beta1.FCCAddOnName, spokeName, err)
+	}
 	desired := addonv1alpha1.AddOnConfig{
 		ConfigGroupResource: addonv1alpha1.ConfigGroupResource{
 			Group:    addonv1alpha1.GroupName,
 			Resource: AddOnDeploymentConfigResource,
 		},
 		ConfigReferent: addonv1alpha1.ConfigReferent{
 			Name:      v1beta1.FCCAddOnName,
 			Namespace: spokeName,
 		},
 	}
-	if slices.ContainsFunc(mca.Spec.Configs, func(c addonv1alpha1.AddOnConfig) bool {
-		return c.Group == desired.Group &&
-			c.Resource == desired.Resource &&
-			c.Name == desired.Name &&
-			c.Namespace == desired.Namespace
-	}) {
+	if slices.ContainsFunc(mca.Spec.Configs, func(c addonv1alpha1.AddOnConfig) bool {
+		return c.ConfigGroupResource.Group == desired.ConfigGroupResource.Group &&
+			c.ConfigGroupResource.Resource == desired.ConfigGroupResource.Resource &&
+			c.ConfigReferent.Name == desired.ConfigReferent.Name &&
+			c.ConfigReferent.Namespace == desired.ConfigReferent.Namespace
+	}) {
 		return nil
 	}
 	mca.Spec.Configs = append(mca.Spec.Configs, desired)
 	patchBytes, err := json.Marshal(map[string]any{
 		"spec": map[string]any{"configs": mca.Spec.Configs},
 	})
 	if err != nil {
 		return fmt.Errorf("failed to marshal patch for fleetconfig-controller-manager: %v", err)
 	}
 	if _, err = addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Patch(
 		ctx,
 		v1beta1.FCCAddOnName,
 		types.MergePatchType,
 		patchBytes,
 		metav1.PatchOptions{},
 	); err != nil {
 		return fmt.Errorf("failed to patch fleetconfig-controller-manager: %v", err)
 	}
 	return nil
 }

This aligns with prior feedback to make the patch resilient and idempotent. Based on learnings

fleetconfig-controller/api/v1beta1/constants.go (1)

93-95: LGTM: fixed HubNamespaceEnvVar comment.

Comment now correctly references the Hub resource.

fleetconfig-controller/internal/controller/v1beta1/constants.go (1)

35-36: Label constant encodes key=value; split into key and value.

Encoding the selector as "key=value" will produce an invalid label key when used in ObjectMeta.Labels. Define separate key and value constants.

-	fccAddOnManifestWorkLabel = "open-cluster-management.io/addon-name=fleetconfig-controller-manager"
+	fccAddOnManifestWorkLabelKey   = "open-cluster-management.io/addon-name"
+	fccAddOnManifestWorkLabelValue = "fleetconfig-controller-manager"

Adjust call sites to set labels as: map[string]string{fccAddOnManifestWorkLabelKey: fccAddOnManifestWorkLabelValue}.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
fleetconfig-controller/api/v1beta1/constants.go (2)

141-143: Doc fix: correct the URL key comment.

The comment should reference AddonConfigMapManifestURLKey, not AddonConfigMapManifestRawKey.

Apply this diff:

-	// AddonConfigMapManifestRawKey is the data key containing a URL to download manifests.
+	// AddonConfigMapManifestURLKey is the data key containing a URL to download manifests.

155-161: Typo in docstring for ManagedByLabels.

“labeles applies” → “labels applied”.

Apply this diff:

-	// ManagedByLabels are labeles applies to resources to denote that fleetconfig-controller is managing the lifecycle.
+	// ManagedByLabels are labels applied to resources to denote that fleetconfig-controller is managing the lifecycle.
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)

347-351: Fix potential nil-pointer deref and typo (isAddondEnabled → isAddonEnabled).

cr.DesiredConfig can be nil; dereferencing crashes the webhook. Also fix the function name typo.

Apply this diff:

-		if isAddondEnabled(mcAddOns.Items, removedAddon) {
+		if isAddonEnabled(mcAddOns.Items, removedAddon) {
 			inUseAddons = append(inUseAddons, removedAddon)
 		}
@@
-// isAddonConfigInUse checks if a removed addon config is still referenced by any ManagedClusterAddOn.
-func isAddondEnabled(mcAddOns []addonv1alpha1.ManagedClusterAddOn, removedAddon string) bool {
+// isAddonEnabled checks if a removed addon config is still referenced by any ManagedClusterAddOn.
+func isAddonEnabled(mcAddOns []addonv1alpha1.ManagedClusterAddOn, removedAddon string) bool {
 	for _, mcao := range mcAddOns {
 		for _, cr := range mcao.Status.ConfigReferences {
-			if cr.DesiredConfig.Name == removedAddon {
+			if cr.DesiredConfig == nil {
+				continue
+			}
+			if cr.DesiredConfig.Name == removedAddon {
 				return true
 			}
 		}
 	}
 	return false
 }

Also applies to: 415-424

🧹 Nitpick comments (5)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)

379-387: Optional: return early if add-on list is already invalid.

Short-circuit on existing errs to skip hub lookup and reduce noisy warnings/work.

Apply this diff:

-	// try to get hub, if not present or not ready, log a warning that addons cant be properly validated
+	// Bail out early if we already have hard validation failures.
+	if len(errs) > 0 {
+		return nil, errs
+	}
+	// Try to get hub; if not present or not ready, warn that add-ons can't be validated.
 	hub := &v1beta1.Hub{}
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (4)

184-188: Use the central FCC add-on name constant to avoid drift.

Prefer v1beta1.FCCAddOnName over a string literal.

Apply this diff:

-				Name:      "fleetconfig-controller-manager",
+				Name:      v1beta1.FCCAddOnName,

353-356: Use the central FCC add-on name constant during cleanup.

Avoid hardcoding the add-on name.

Apply this diff:

-	if !spoke.IsHubAsSpoke() {
-		spokeCopy.Spec.AddOns = append(spokeCopy.Spec.AddOns, v1beta1.AddOn{ConfigName: "fleetconfig-controller-manager"}) // disable all except fcc
-	}
+	if !spoke.IsHubAsSpoke() {
+		// disable all except FCC
+		spokeCopy.Spec.AddOns = append(spokeCopy.Spec.AddOns, v1beta1.AddOn{ConfigName: v1beta1.FCCAddOnName})
+	}

28-29: Unify registration driver constant to v1beta1 and drop the v1alpha1 import.

Keeps constants in one API surface.

Apply this diff:

@@
-	"github.com/open-cluster-management-io/lab/fleetconfig-controller/api/v1alpha1"
 	"github.com/open-cluster-management-io/lab/fleetconfig-controller/api/v1beta1"
@@
-	if ra.Driver == v1alpha1.AWSIRSARegistrationDriver {
+	if ra.Driver == v1beta1.AWSIRSARegistrationDriver {
 		raArgs := []string{
 			fmt.Sprintf("--registration-auth=%s", ra.Driver),
 		}

And remove the now-unused v1alpha1 import.

Also applies to: 585-597


952-955: Merge semantics vs comment: zero-values from spec are not ignored.

mergo.Map with WithOverride will overwrite keys even with zero-values in specInterface. Either update the comment to reflect current behavior, or change the merge to skip zero-values from spec.

Apply this doc tweak if keeping current behavior:

-	// Merge spec on top but ignore zero-values from spec
+	// Merge spec values on top (overriding overlapping keys). Note: zero-values in spec also override.

Alternatively, prune zero-like values from specInterface before merging; I can provide a small helper to do that if desired.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aaedc87 and 0b0b000.

📒 Files selected for processing (4)
  • fleetconfig-controller/api/v1beta1/constants.go (4 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/constants.go (1 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • fleetconfig-controller/internal/controller/v1beta1/constants.go
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/api/v1beta1/constants.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
  • SupportedHubAddons (172-175)
  • FCCAddOnName (103-103)
  • HubInitialized (19-19)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
  • AddOn (241-253)
  • HubRef (86-94)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (46-52)
fleetconfig-controller/api/v1beta1/spoke_types.go (5)
  • Spoke (281-295)
  • KlusterletChartConfig (186-188)
  • HubRef (86-94)
  • AddOn (241-253)
  • Klusterlet (107-169)
fleetconfig-controller/api/v1beta1/constants.go (19)
  • ClusterTypeHub (76-76)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
  • SpokeJoined (28-28)
  • LabelManagedClusterType (118-118)
  • ManagedClusterTypeHubAsSpoke (71-71)
  • ControllerNamespaceEnvVar (97-97)
  • HubNamespaceEnvVar (94-94)
  • SpokeNamespaceEnvVar (91-91)
  • AddonsConfigured (22-22)
  • RoleNameEnvVar (100-100)
  • DefaultFCCManagerRole (106-106)
  • FCCAddOnName (103-103)
  • PivotComplete (31-31)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • HubCleanupFinalizer (10-10)
  • HubInitialized (19-19)
  • AWSIRSARegistrationDriver (130-130)
fleetconfig-controller/pkg/common/common.go (6)
  • ClusterClient (20-30)
  • AddOnClient (59-69)
  • GetManagedCluster (72-81)
  • UpdateManagedCluster (84-89)
  • WorkClient (46-56)
  • OperatorClient (33-43)
fleetconfig-controller/internal/kube/kube.go (3)
  • KubeconfigFromSecretOrCluster (113-119)
  • RawFromInClusterRestConfig (75-81)
  • RestConfigFromKubeconfig (22-32)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (2)
  • PrepareResources (32-55)
  • PrepareKubeconfig (15-29)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/version/version.go (2)
  • LowestBundleVersion (15-46)
  • Normalize (49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (1)
fleetconfig-controller/api/v1beta1/constants.go (1)

151-153: Allowing http for add-on manifests may leak integrity.

Do we intentionally allow plain http here? Consider restricting to https only or gating http behind a dev/debug flag.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
fleetconfig-controller/cmd/manager/manager.go (3)

166-177: Trim whitespace when validating required env vars.

Rejecting only empty string allows whitespace-only values. Trim to avoid accidental misconfig.

Apply this diff:

-	spokeNamespace := os.Getenv(apiv1beta1.SpokeNamespaceEnvVar)
-	if spokeNamespace == "" {
+	spokeNamespace := strings.TrimSpace(os.Getenv(apiv1beta1.SpokeNamespaceEnvVar))
+	if spokeNamespace == "" {
 		err = fmt.Errorf("CLUSTER_NAMESPACE environment variable must be set")
 		setupLog.Error(err, "unable to create controller", "controller", "Spoke")
 		return nil, err
 	}
-	hubNamespace := os.Getenv(apiv1beta1.HubNamespaceEnvVar)
-	if hubNamespace == "" {
+	hubNamespace := strings.TrimSpace(os.Getenv(apiv1beta1.HubNamespaceEnvVar))
+	if hubNamespace == "" {
 		err = fmt.Errorf("HUB_NAMESPACE environment variable must be set")
 		setupLog.Error(err, "unable to create controller", "controller", "Spoke")
 		return nil, err
 	}

And add:

+	"strings"

to imports.


150-151: Avoid constructing an unused webhook server in ForSpoke.

You’re only using TLSOpts for metrics; creating a webhook.Server is unnecessary work.

Apply this diff:

-	_, tlsOpts := setupServer(opts, setupLog)
+	tlsOpts := []func(*tls.Config){}
+	if !opts.EnableHTTP2 {
+		tlsOpts = append(tlsOpts, func(c *tls.Config) { setupLog.Info("disabling http/2"); c.NextProtos = []string{"http/1.1"} })
+	}

274-277: Prefer parsing kubeconfig over raw string replacement.

Byte-wise ReplaceAll may unintentionally rewrite unrelated content. Parse and update client-certificate/client-key fields for robustness.

Apply this diff:

-	// patch the kubeconfig with absolute paths to cert/key
-	hubKubeconfig = bytes.ReplaceAll(hubKubeconfig, []byte("tls.crt"), []byte(certPath))
-	hubKubeconfig = bytes.ReplaceAll(hubKubeconfig, []byte("tls.key"), []byte(keyPath))
+	// patch kubeconfig with absolute cert/key paths
+	type kubeCfg struct {
+		Users []struct {
+			Name string `json:"name" yaml:"name"`
+			User struct {
+				ClientCertificate string `json:"client-certificate,omitempty" yaml:"client-certificate,omitempty"`
+				ClientKey         string `json:"client-key,omitempty" yaml:"client-key,omitempty"`
+			} `json:"user" yaml:"user"`
+		} `json:"users" yaml:"users"`
+	}
+	var kc kubeCfg
+	if err := yaml.Unmarshal(hubKubeconfig, &kc); err == nil {
+		for i := range kc.Users {
+			if kc.Users[i].User.ClientCertificate != "" {
+				kc.Users[i].User.ClientCertificate = certPath
+			}
+			if kc.Users[i].User.ClientKey != "" {
+				kc.Users[i].User.ClientKey = keyPath
+			}
+		}
+		if patched, err := yaml.Marshal(kc); err == nil {
+			hubKubeconfig = patched
+		}
+	}

Note: CA patching isn’t needed since OCM addon kubeconfigs embed certificate-authority-data.
Based on learnings

fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (2)

720-724: Use a constant for the ManagedClusterJoined condition.

Avoid hardcoding the string; prefer the upstream constant if available to prevent typos.

Example:

if c.Type == clusterv1.ManagedClusterConditionJoined { ... }

972-979: Merge semantics comment vs. code clarity.

The comment says “ignore zero-values from spec” which matches mergo’s default behavior (even with WithOverride, zero values are still ignored). Consider clarifying this to avoid confusion.

Proposed wording: “Merge spec on top; non-empty spec values override values from ConfigMap.”

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ba3cbf and 674fb05.

📒 Files selected for processing (5)
  • fleetconfig-controller/api/v1beta1/spoke_types.go (2 hunks)
  • fleetconfig-controller/cmd/manager/manager.go (1 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/addon.go (10 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go (4 hunks)
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
📚 Learning: 2025-09-26T04:15:34.443Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.443Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
📚 Learning: 2025-09-26T04:13:12.098Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:719-724
Timestamp: 2025-09-26T04:13:12.098Z
Learning: The TestValidateAddonUniqueness function in fleetconfig-controller/internal/webhook/v1beta1/validation_test.go has flawed assertion logic that uses `len(err.Error()) >= len(expectedMsg)` to check error messages, which doesn't actually validate message content and will pass for any error longer than the expected message.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/api/v1beta1/spoke_types.go
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/internal/controller/v1beta1/addon.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:26:18.303Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/cmd/manager/manager.go:290-292
Timestamp: 2025-09-25T23:26:18.303Z
Learning: ManagedClusterAddOn hub kubeconfigs in Open Cluster Management use certificate-authority-data (base64-encoded inline CA certificate) rather than certificate-authority (file path reference). This makes the kubeconfig self-contained and eliminates the need to patch CA file paths.

Applied to files:

  • fleetconfig-controller/cmd/manager/manager.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (4)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
  • SupportedHubAddons (172-175)
  • FCCAddOnName (103-103)
  • HubInitialized (19-19)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
  • AddOn (255-267)
  • HubRef (87-95)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
fleetconfig-controller/api/v1beta1/constants.go (4)
  • ManagedClusterTypeHubAsSpoke (71-71)
  • PivotComplete (31-31)
  • SpokeJoined (28-28)
  • SpokeCleanupFinalizer (13-13)
fleetconfig-controller/api/v1beta1/common.go (1)
  • Kubeconfig (13-29)
fleetconfig-controller/internal/controller/v1beta1/addon.go (3)
fleetconfig-controller/api/v1beta1/constants.go (4)
  • SupportedHubAddons (172-175)
  • LabelAddOnManagedBy (121-121)
  • FCCAddOnName (103-103)
  • AddonArgoCD (165-165)
fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
  • GroupName (10-10)
  • Resource (26-28)
fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
  • AddOnDeploymentConfigResource (30-30)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (46-52)
fleetconfig-controller/api/v1beta1/spoke_types.go (5)
  • Spoke (295-309)
  • KlusterletChartConfig (200-202)
  • HubRef (87-95)
  • AddOn (255-267)
  • Klusterlet (121-183)
fleetconfig-controller/api/v1beta1/constants.go (19)
  • ClusterTypeHub (76-76)
  • PivotComplete (31-31)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
  • SpokeJoined (28-28)
  • LabelManagedClusterType (118-118)
  • ManagedClusterTypeHubAsSpoke (71-71)
  • ControllerNamespaceEnvVar (97-97)
  • HubNamespaceEnvVar (94-94)
  • SpokeNamespaceEnvVar (91-91)
  • AddonsConfigured (22-22)
  • RoleNameEnvVar (100-100)
  • DefaultFCCManagerRole (106-106)
  • FCCAddOnName (103-103)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • HubCleanupFinalizer (10-10)
  • HubInitialized (19-19)
  • AWSIRSARegistrationDriver (130-130)
fleetconfig-controller/pkg/common/common.go (6)
  • ClusterClient (20-30)
  • AddOnClient (59-69)
  • GetManagedCluster (72-81)
  • UpdateManagedCluster (84-89)
  • WorkClient (46-56)
  • OperatorClient (33-43)
fleetconfig-controller/internal/kube/kube.go (3)
  • KubeconfigFromSecretOrCluster (113-119)
  • RawFromInClusterRestConfig (75-81)
  • RestConfigFromKubeconfig (22-32)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (2)
  • PrepareResources (32-55)
  • PrepareKubeconfig (15-29)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/version/version.go (2)
  • LowestBundleVersion (15-46)
  • Normalize (49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (10)
fleetconfig-controller/internal/controller/v1beta1/addon.go (1)

408-446: Reintroduce eventual-consistency handling when patching the MCA

Right after clusteradm addon enable, the ManagedClusterAddOn is often absent for a few seconds. The current Get call treats that as a hard failure, so enabling FCC on a busy or fresh spoke will routinely break again—the same race we fixed earlier. Please restore the short poll / NotFound tolerance so we wait for the MCA to appear before patching.

Apply this diff to fix it:

 func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error {
-	mca, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{})
-	if err != nil {
-		return fmt.Errorf("failed to configure fleetconfig-controller-manager: %v", err)
-	}
-	desired := addonv1alpha1.AddOnConfig{
+	// Wait briefly for the MCA to show up after enable
+	var mca *addonv1alpha1.ManagedClusterAddOn
+	if err := wait.PollUntilContextTimeout(ctx, time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) {
+		obj, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{})
+		if err != nil {
+			if kerrs.IsNotFound(err) {
+				return false, nil
+			}
+			return false, err
+		}
+		mca = obj
+		return true, nil
+	}); err != nil {
+		return fmt.Errorf("failed to locate ManagedClusterAddOn %q in %q: %v", v1beta1.FCCAddOnName, spokeName, err)
+	}
+
+	desired := addonv1alpha1.AddOnConfig{
 		ConfigGroupResource: addonv1alpha1.ConfigGroupResource{
 			Group:    addonv1alpha1.GroupName,
 			Resource: AddOnDeploymentConfigResource,
 		},
 		ConfigReferent: addonv1alpha1.ConfigReferent{
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)

152-157: Nice guardrail on supported Hub add-ons.

Validating the Hub add-on names against v1beta1.SupportedHubAddons will catch typos and unsanctioned entries up front while keeping the constant in one place. Clean, centralized check.


365-377: Good enforcement around the FCC add-on.

Differentiating hub-as-spoke clusters from regular spokes and wiring both cases through the shared v1beta1.FCCAddOnName constant keeps the admission rules tight and eliminates drift. Nicely done.

fleetconfig-controller/api/v1beta1/spoke_types.go (2)

102-106: LGTM: Hub-as-Spoke detection looks consistent.

Name check and in-cluster kubeconfig are reasonable signals for hub-as-spoke. No issues spotted.


107-118: LGTM: PivotComplete gating is sound.

Requiring SpokeJoined=true, PivotComplete=true, and presence of SpokeCleanupFinalizer aligns with the deletion flow ordering and avoids premature hub cleanup.

Based on learnings

fleetconfig-controller/cmd/manager/manager.go (2)

265-268: Good fix: derive cert/key base path via filepath.Dir.

This addresses prior fragility with TrimSuffix("kubeconfig").


259-283: CA handling is correct; no need to patch certificate-authority.

OCM addon hub kubeconfigs use certificate-authority-data, so cert/key path rewrites are sufficient.

If you expect any kubeconfigs with certificate-authority (file path), confirm and we can extend the patcher. Otherwise this is good as-is. Based on learnings

fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (3)

496-527: Double-check “self-destruct” namespace deletions.

Deleting CONTROLLER_NAMESPACE on the spoke is intentional, but ensure the manager isn’t running against that same cluster at this point to avoid deleting itself mid-reconcile in a way that leaves dependent resources half-cleaned.

Would you like a preflight check that skips deleting the controller namespace if LeaderElection is still active?


671-672: Redact hub token in logs.

joinArgs includes --hub-token; logging it leaks credentials.

Apply this diff:

-	logger.V(1).Info("clusteradm join", "args", joinArgs)
+	// Redact sensitive args before logging
+	maskedArgs := make([]string, len(joinArgs))
+	copy(maskedArgs, joinArgs)
+	for i := 0; i < len(maskedArgs)-1; i++ {
+		if maskedArgs[i] == "--hub-token" {
+			maskedArgs[i+1] = "REDACTED"
+			break
+		}
+	}
+	logger.V(1).Info("clusteradm join", "args", maskedArgs)

897-897: Stop logging the token payload.

stdout contains the token JSON; don’t emit it.

Apply this diff:

-	logger.V(1).Info("got join token", "output", string(stdout))
+	logger.V(1).Info("got join token")

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 674fb05 and e511abd.

📒 Files selected for processing (4)
  • fleetconfig-controller/api/v1beta1/spoke_types.go (1 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go (7 hunks)
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (8 hunks)
🧰 Additional context used
🧠 Learnings (10)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-26T04:15:34.443Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.443Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go
📚 Learning: 2025-09-26T04:13:12.098Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:719-724
Timestamp: 2025-09-26T04:13:12.098Z
Learning: The TestValidateAddonUniqueness function in fleetconfig-controller/internal/webhook/v1beta1/validation_test.go has flawed assertion logic that uses `len(err.Error()) >= len(expectedMsg)` to check error messages, which doesn't actually validate message content and will pass for any error longer than the expected message.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/webhook/v1beta1/validation_test.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/webhook/v1beta1/validation.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (4)
fleetconfig-controller/api/v1beta1/spoke_types.go (3)
fleetconfig-controller/api/v1alpha1/fleetconfig_types.go (2)
  • Spoke (350-392)
  • Kubeconfig (301-317)
fleetconfig-controller/api/v1beta1/constants.go (3)
  • ManagedClusterTypeHubAsSpoke (71-71)
  • PivotComplete (31-31)
  • SpokeJoined (28-28)
fleetconfig-controller/api/v1beta1/common.go (1)
  • Kubeconfig (13-29)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/spoke_types.go (3)
  • Klusterlet (117-179)
  • AddOn (251-263)
  • HubRef (86-94)
fleetconfig-controller/api/v1beta1/constants.go (3)
  • SupportedHubAddons (172-175)
  • FCCAddOnName (103-103)
  • HubInitialized (19-19)
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)
fleetconfig-controller/api/v1beta1/hub_types.go (4)
  • Hub (243-257)
  • HubSpec (26-79)
  • AddOnConfig (158-181)
  • HubAddOn (184-198)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (46-52)
fleetconfig-controller/api/v1beta1/spoke_types.go (5)
  • Spoke (291-305)
  • KlusterletChartConfig (196-198)
  • HubRef (86-94)
  • AddOn (251-263)
  • Klusterlet (117-179)
fleetconfig-controller/api/v1beta1/constants.go (19)
  • ClusterTypeHub (76-76)
  • PivotComplete (31-31)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
  • SpokeJoined (28-28)
  • LabelManagedClusterType (118-118)
  • ManagedClusterTypeHubAsSpoke (71-71)
  • ControllerNamespaceEnvVar (97-97)
  • HubNamespaceEnvVar (94-94)
  • SpokeNamespaceEnvVar (91-91)
  • AddonsConfigured (22-22)
  • RoleNameEnvVar (100-100)
  • DefaultFCCManagerRole (106-106)
  • FCCAddOnName (103-103)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • HubCleanupFinalizer (10-10)
  • HubInitialized (19-19)
  • AWSIRSARegistrationDriver (130-130)
fleetconfig-controller/pkg/common/common.go (6)
  • ClusterClient (20-30)
  • AddOnClient (59-69)
  • GetManagedCluster (72-81)
  • UpdateManagedCluster (84-89)
  • WorkClient (46-56)
  • OperatorClient (33-43)
fleetconfig-controller/internal/kube/kube.go (3)
  • KubeconfigFromSecretOrCluster (113-119)
  • RawFromInClusterRestConfig (75-81)
  • RestConfigFromKubeconfig (22-32)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (2)
  • PrepareResources (32-55)
  • PrepareKubeconfig (15-29)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/version/version.go (2)
  • LowestBundleVersion (15-46)
  • Normalize (49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (8)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (2)

674-681: Redact the hub token before logging clusteradm args.

Line 674 still emits joinArgs with --hub-token <secret>, leaking credentials to the controller logs—the same issue we previously flagged. Please mask that entry before logging.

-	logger.V(1).Info("clusteradm join", "args", joinArgs)
+	maskedArgs := make([]string, len(joinArgs))
+	copy(maskedArgs, joinArgs)
+	for i := 0; i < len(maskedArgs)-1; i++ {
+		if maskedArgs[i] == "--hub-token" {
+			maskedArgs[i+1] = "REDACTED"
+			break
+		}
+	}
+	logger.V(1).Info("clusteradm join", "args", maskedArgs)

900-901: Stop logging the token payload.

clusteradm get token writes the hub join token JSON to stdout; logging string(stdout) at Line 900 still exposes that secret verbatim.

-	logger.V(1).Info("got join token", "output", string(stdout))
+	logger.V(1).Info("got join token")
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (3)

690-701: Fix the expected message for unsupported hub add-ons.

When the validator sees "custom-addon" it returns invalid hubAddOn name. must be one of ..., not the clash message currently asserted here. The test only “passes” because the helper compares string lengths. Please update the expectation to match the real error.

-			errMsgs:  []string{"hubAddOn name argocd clashes with an existing addOnConfig name"},
+			errMsgs:  []string{"invalid hubAddOn name. must be one of"},

703-724: Correct the expected clash strings in the combined-conflicts scenario.

This case still reports clashes on argocd, yet the expected messages mention shared-addon. The mismatch is hidden by the faulty assertion helper—please update the strings.

-				"hubAddOn name shared-addon clashes with an existing addOnConfig name",
-				"hubAddOn name shared-addon clashes with an existing addOnConfig name",
+				"hubAddOn name argocd clashes with an existing addOnConfig name",
+				"hubAddOn name argocd clashes with an existing addOnConfig name",

753-770: Replace the length check with a real substring match.

Lines 757-764 still treat len(actual) >= len(expected) as a “partial match”, so any longer error satisfies the assertion regardless of content. Switch to strings.Contains (and import strings) so the tests actually validate the message.

-import (
-	"testing"
+import (
+	"strings"
+	"testing"
@@
-					if err.Error() != "" && err.Error() != expectedMsg {
-						// For partial matching since error messages include indices
-						if len(expectedMsg) > 0 && err.Error() != "" {
-							// Check if the error message contains the expected substring
-							if len(err.Error()) >= len(expectedMsg) {
-								found = true
-								break
-							}
-						}
-					} else if err.Error() == expectedMsg {
+					if err.Error() == expectedMsg || (expectedMsg != "" && err.Error() != "" && strings.Contains(err.Error(), expectedMsg)) {
 						found = true
 						break
 					}
fleetconfig-controller/api/v1beta1/spoke_types.go (2)

101-104: Solid hub-as-spoke detection guard

Checking both the canonical hub-as-spoke name and the InCluster kubeconfig flag keeps the special-case handling aligned with the new cluster typing story without introducing extra state.


106-114: PivotComplete now mirrors the observed condition flow

Requiring SpokeJoined before honoring PivotComplete matches the reconciler’s readiness semantics and avoids premature “day 2” status flips.

fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)

370-404: Add-on validation now respects hub-as-spoke semantics and preserves hard failures

Blocking FCC enablement on hub-as-spoke spokes, requiring it elsewhere, and returning accumulated errors even when hub discovery falls back to a warning closes the loophole where bad specs slipped through on missing hubs. Nicely handled.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
fleetconfig-controller/devspace.yaml (4)

163-171: Narrow imageSelector and make port mapping explicit for hub dev mapping

To avoid matching unintended pods and to be explicit about port-forwarding, include the tag and map local:remote.

Apply this diff:

-    imageSelector: ${IMAGE_REPOSITORY}
+    imageSelector: ${IMAGE_REPOSITORY}:dev
@@
-    - port: ${HUB_PORT}
+    - port: ${HUB_PORT}:${HUB_PORT}

Optional: add a labelSelector to target only the hub Deployment if hub/spoke share the same image.


189-198: Mirror the imageSelector/port improvements for spoke dev mapping

Same rationale as hub: be explicit and narrow the match.

Apply this diff:

-    imageSelector: ${IMAGE_REPOSITORY}
+    imageSelector: ${IMAGE_REPOSITORY}:dev
@@
-    - port: ${SPOKE_PORT}
+    - port: ${SPOKE_PORT}:${SPOKE_PORT}

Optional: add a labelSelector targeting only the spoke Deployment.

Also applies to: 213-213


12-15: Confirmed port bindings in start scripts

devspace-start-hub.sh listens on 2344 and devspace-start-spoke.sh on 2345. Consider replacing these literals with $HUB_PORT/$SPOKE_PORT for maintainability.


58-66: Add readiness waits & verify debug pipeline targets

  • In fleetconfig-controller/devspace.yaml’s debug and debug-spoke pipelines, prepend:

    kubectl -n cert-manager wait deploy -l app.kubernetes.io/name=cert-manager --for=condition=Available --timeout=120s

  • Confirm both pipelines deploy into fleetconfig-system and use the fleetconfig-controller-manager deployment name/labels when invoking create_deployments.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e511abd and 93e3be6.

📒 Files selected for processing (4)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1 hunks)
  • fleetconfig-controller/devspace.yaml (7 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (9 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1 hunks)
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T18:42:03.404Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.

Applied to files:

  • fleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-22T19:26:11.020Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/test/data/fleetconfig-v1alpha1.yaml:47-53
Timestamp: 2025-09-22T19:26:11.020Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller tests, the kubeconfigKey is intentionally set to "value" in test fixtures (fleetconfig-v1alpha1.yaml, fleetconfig-values.yaml) because that's how the test harness provisions the kubeconfig secret during test setup. This differs from the chart default of "kubeconfig" but is correct for the test environment.

Applied to files:

  • fleetconfig-controller/devspace.yaml
📚 Learning: 2025-08-27T21:58:32.141Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.

Applied to files:

  • fleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-25T23:31:11.561Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml:110-112
Timestamp: 2025-09-25T23:31:11.561Z
Learning: The fleetconfig-controller-manager spoke agent requires create/update/patch/delete permissions on CustomResourceDefinitions because `clusteradm upgrade klusterlet` operations need create/update permissions and cleanup operations require delete permissions for proper lifecycle management.

Applied to files:

  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (2)
fleetconfig-controller/api/v1beta1/constants.go (8)
  • ClusterTypeHub (76-76)
  • HubCleanupFinalizer (10-10)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
  • PivotComplete (31-31)
  • SpokeNameEnvVar (88-88)
fleetconfig-controller/cmd/manager/manager.go (1)
  • Options (33-45)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (46-52)
fleetconfig-controller/api/v1beta1/spoke_types.go (5)
  • Spoke (291-305)
  • KlusterletChartConfig (196-198)
  • HubRef (86-94)
  • AddOn (251-263)
  • Klusterlet (117-179)
fleetconfig-controller/api/v1beta1/constants.go (17)
  • ClusterTypeHub (76-76)
  • ClusterTypeSpoke (79-79)
  • SupportedClusterTypes (110-113)
  • SpokeJoined (28-28)
  • LabelManagedClusterType (118-118)
  • ManagedClusterTypeHubAsSpoke (71-71)
  • ControllerNamespaceEnvVar (97-97)
  • HubNamespaceEnvVar (94-94)
  • SpokeNamespaceEnvVar (91-91)
  • AddonsConfigured (22-22)
  • RoleNameEnvVar (100-100)
  • DefaultFCCManagerRole (106-106)
  • FCCAddOnName (103-103)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • HubCleanupFinalizer (10-10)
  • HubInitialized (19-19)
fleetconfig-controller/pkg/common/common.go (6)
  • ClusterClient (20-30)
  • AddOnClient (59-69)
  • GetManagedCluster (72-81)
  • UpdateManagedCluster (84-89)
  • WorkClient (46-56)
  • OperatorClient (33-43)
fleetconfig-controller/internal/kube/kube.go (3)
  • KubeconfigFromSecretOrCluster (113-119)
  • RawFromInClusterRestConfig (75-81)
  • RestConfigFromKubeconfig (22-32)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (2)
  • PrepareResources (32-55)
  • PrepareKubeconfig (15-29)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/version/version.go (2)
  • LowestBundleVersion (15-46)
  • Normalize (49-55)
🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml

[error] 19-19: syntax error: expected , but found ''

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (10)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1)

19-20: Quote Helm-built names to unblock YAML lint

Unquoted concatenations like name: {{ include "chart.fullname" . }}-manager are still tripping YAMLlint (expected <block end>, but found '<scalar>'). Please wrap every similar name, serviceAccountName, and clusterRoleName occurrence in quotes (e.g. {{ printf "%s-manager" (include "chart.fullname" .) | quote }}) to render valid YAML. This applies here and at Lines 82-83, 116-120, 128-152, and 163-163.

Apply this diff (excerpt):

-          name: {{ include "chart.fullname" . }}-manager
+          name: {{ printf "%s-manager" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-manager
+          name: {{ printf "%s-manager" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-manager-spoke-rolebinding
+          name: {{ printf "%s-manager-spoke-rolebinding" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-manager-spoke-role
+          name: {{ printf "%s-manager-spoke-role" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-leader-election-role
+          name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }}
@@
-          name: {{ include "chart.fullname" . }}-leader-election-rolebinding
+          name: {{ printf "%s-leader-election-rolebinding" (include "chart.fullname" .) | quote }}
@@
-              clusterRoleName: {{ include "chart.fullname" . }}-manager-role
+              clusterRoleName: {{ printf "%s-manager-role" (include "chart.fullname" .) | quote }}
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)

250-256: Validate CLUSTER_NAME before adding the predicate

If this env var is empty we scope the controller to a zero-length name, so it never reconciles anything. Please fail fast instead of assuming other wiring already validated it.

 func (r *SpokeReconciler) SetupWithManagerForSpoke(mgr ctrl.Manager) error {
-	spokeName := os.Getenv(v1beta1.SpokeNameEnvVar) // we know this is set, because the mgr setup would have failed otherwise
+	spokeName := strings.TrimSpace(os.Getenv(v1beta1.SpokeNameEnvVar))
+	if spokeName == "" {
+		return fmt.Errorf("env %s must be set to the Spoke name", v1beta1.SpokeNameEnvVar)
+	}
 	return ctrl.NewControllerManagedBy(mgr).

Don't forget to add strings to the imports.

fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (4)

305-313: Guard optional ClusterManager before dereferencing

hub.Spec.ClusterManager is optional in the API; if it’s nil this panics the controller. Please nil-check (and reuse a local source variable) before touching .Source.

-	if hub != nil && hub.Spec.ClusterManager.Source.BundleVersion != "" {
-		upgrade, err := r.spokeNeedsUpgrade(ctx, spoke, currKlusterletHash, hub.Spec.ClusterManager.Source, spokeKubeconfig)
+	if hub != nil && hub.Spec.ClusterManager != nil && hub.Spec.ClusterManager.Source.BundleVersion != "" {
+		source := hub.Spec.ClusterManager.Source
+		upgrade, err := r.spokeNeedsUpgrade(ctx, spoke, currKlusterletHash, source, spokeKubeconfig)
 		if err != nil {
 			return fmt.Errorf("failed to check if spoke cluster needs upgrade: %w", err)
 		}
 
 		if upgrade {
-			if err := r.upgradeSpoke(ctx, spoke, klusterletValues, hub.Spec.ClusterManager.Source, spokeKubeconfig); err != nil {
+			if err := r.upgradeSpoke(ctx, spoke, klusterletValues, source, spokeKubeconfig); err != nil {

590-592: Handle missing ClusterManager config before building join args

We still dereference hub.Spec.ClusterManager without a guard. When that block is omitted (it’s optional) the join path will panic. Please detect the nil case and return a clear error instead of crashing.

-		"--bundle-version", hub.Spec.ClusterManager.Source.BundleVersion,
-		"--image-registry", hub.Spec.ClusterManager.Source.Registry,
+	}, spoke.BaseArgs()...)
+
+	if hub.Spec.ClusterManager == nil {
+		return fmt.Errorf("hub %s/%s is missing clusterManager configuration", hub.Namespace, hub.Name)
+	}
+	source := hub.Spec.ClusterManager.Source
+	joinArgs = append(joinArgs,
+		"--bundle-version", source.BundleVersion,
+		"--image-registry", source.Registry,

684-685: Redact the hub token before logging join args

joinArgs includes --hub-token <secret>, so this log leaks credentials. Please mask or drop the token argument before logging.

-	logger.V(1).Info("clusteradm join", "args", joinArgs)
+	maskedArgs := make([]string, len(joinArgs))
+	copy(maskedArgs, joinArgs)
+	for i := 0; i < len(maskedArgs)-1; i++ {
+		if maskedArgs[i] == "--hub-token" {
+			maskedArgs[i+1] = "REDACTED"
+			break
+		}
+	}
+	logger.V(1).Info("clusteradm join", "args", maskedArgs)

910-911: Stop logging the raw join token payload

string(stdout) contains the token JSON, so this log also exposes credentials. Please drop the payload or replace it with an opaque message.

-	logger.V(1).Info("got join token", "output", string(stdout))
+	logger.V(1).Info("got join token")
fleetconfig-controller/devspace.yaml (4)

42-42: Dev pipeline now targets hub dev mapping — LGTM

Scoped start_dev reduces accidental spoke attachment; good change.


114-116: Chart wiring for fleetConfig.enabled — LGTM

Profiles flip this via vars; consistent with prior workflow.


172-188: Sync whitelist for hub — LGTM

Tight include set minimizes sync churn; inclusion of devspace-start-hub.sh is correct.


137-144: Resolve: debug manifest directories exist
Both fleetconfig-controller/config/devspace/hub and fleetconfig-controller/config/devspace/spoke are present.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)

92-106: Polish the CleanupConfig doc comments.

Nitpicks: fix “it’s” → “its” and drop the duplicate “will” so the comments read cleanly.

-	// If true, the agent will attempt to garbage collect it's own namespace after the spoke cluster is unjoined.
+	// If true, the agent will attempt to garbage collect its own namespace after the spoke cluster is unjoined.
@@
-	// If set, the kubeconfig secret will will be automatically deleted after the agent has taken over managing the Spoke.
+	// If set, the kubeconfig secret will be automatically deleted after the agent has taken over managing the Spoke.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 90ca883 and d9a9dd3.

📒 Files selected for processing (4)
  • fleetconfig-controller/api/v1beta1/constants.go (4 hunks)
  • fleetconfig-controller/api/v1beta1/spoke_types.go (2 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/deployment-dev.yaml (1 hunks)
  • fleetconfig-controller/config/devspace/hub/manager.yaml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/deployment-dev.yaml
  • fleetconfig-controller/config/devspace/hub/manager.yaml
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/api/v1beta1/constants.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
🧬 Code graph analysis (1)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
  • ManagedClusterTypeHubAsSpoke (77-77)
  • PivotComplete (31-31)
  • SpokeJoined (28-28)
fleetconfig-controller/api/v1beta1/common.go (1)
  • Kubeconfig (13-29)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)

158-194: Pass the hub kubeconfig into clusteradm accept.

clusteradm accept is invoked without --kubeconfig, so it falls back to ~/.kube/config. Inside the controller pod that file (and KUBECONFIG) are absent, which makes every accept attempt fail. Hand the hub kubeconfig/context to acceptCluster (like we already do for join/get token) and plumb it through so the command always targets the hub API.

🧹 Nitpick comments (2)
fleetconfig-controller/internal/watch/watch.go (2)

51-68: Consider adding panic recovery to prevent manager crashes.

The Start method implements the manager.Runnable pattern correctly. However, if the check method or the handler function panics, Start will panic and crash the manager. Consider adding a recover() block in the ticker case to log panics and continue the loop, improving resilience.

Apply this diff to add panic recovery:

 		case <-ticker.C:
+			func() {
+				defer func() {
+					if r := recover(); r != nil {
+						w.log.Error(fmt.Errorf("panic: %v", r), "Watch check panicked", "name", w.name)
+					}
+				}()
 				if err := w.check(ctx); err != nil {
 					w.log.Error(err, "Watch check failed", "name", w.name)
 				}
+			}()
 		}

Note: You'll need to import "fmt" for this change.


70-82: Consider adding timeouts for condition and handler execution.

The check method correctly evaluates the condition and invokes the handler. However, if the condition or handler functions are slow or block indefinitely, this could delay subsequent ticks or cause the watcher to become unresponsive. Consider wrapping the condition and handler calls with a context timeout to prevent long-running operations from blocking the watcher.

Example:

func (w *ResourceWatcher) check(ctx context.Context) error {
	// Add timeout for condition check
	conditionCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
	defer cancel()
	
	met, err := w.condition(conditionCtx, w.client)
	if err != nil {
		return err
	}

	if !met {
		return nil
	}

	w.log.V(1).Info("Condition met, executing handler", "name", w.name)
	
	// Add timeout for handler execution
	handlerCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
	defer cancel()
	
	return w.handler(handlerCtx, w.client)
}

You may want to make the timeout configurable via the Config struct.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d9a9dd3 and 65fac35.

📒 Files selected for processing (8)
  • fleetconfig-controller/api/v1beta1/constants.go (4 hunks)
  • fleetconfig-controller/api/v1beta1/spoke_types.go (2 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml (2 hunks)
  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1 hunks)
  • fleetconfig-controller/config/devspace/spoke/manager.yaml (1 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (10 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1 hunks)
  • fleetconfig-controller/internal/watch/watch.go (1 hunks)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml
  • fleetconfig-controller/api/v1beta1/spoke_types.go
  • fleetconfig-controller/api/v1beta1/constants.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.

Applied to files:

  • fleetconfig-controller/api/v1beta1/spoke_types.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:31:11.630Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml:110-112
Timestamp: 2025-09-25T23:31:11.630Z
Learning: The fleetconfig-controller-manager spoke agent requires create/update/patch/delete permissions on CustomResourceDefinitions because `clusteradm upgrade klusterlet` operations need create/update permissions and cleanup operations require delete permissions for proper lifecycle management.

Applied to files:

  • fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
🧬 Code graph analysis (2)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
  • ManagedClusterTypeHubAsSpoke (77-77)
  • PivotComplete (31-31)
  • SpokeJoined (28-28)
fleetconfig-controller/api/v1beta1/common.go (1)
  • Kubeconfig (13-29)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (51-57)
fleetconfig-controller/api/v1beta1/spoke_types.go (6)
  • Spoke (309-323)
  • KlusterletChartConfig (214-216)
  • HubRef (110-118)
  • CleanupConfig (91-107)
  • AddOn (269-281)
  • Klusterlet (141-197)
fleetconfig-controller/api/v1beta1/constants.go (23)
  • InstanceTypeManager (83-83)
  • PivotComplete (31-31)
  • InstanceTypeUnified (89-89)
  • InstanceTypeAgent (86-86)
  • SupportedInstanceTypes (135-139)
  • KlusterletSynced (34-34)
  • SpokeJoined (28-28)
  • LabelManagedClusterType (151-151)
  • ManagedClusterTypeHubAsSpoke (77-77)
  • FCCAddOnName (116-116)
  • ControllerNamespaceEnvVar (107-107)
  • HubNamespaceEnvVar (104-104)
  • SpokeNamespaceEnvVar (101-101)
  • PurgeAgentNamespaceEnvVar (113-113)
  • AddonsConfigured (22-22)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • HubCleanupFinalizer (10-10)
  • OCMSpokeNamespaces (142-146)
  • HubInitialized (19-19)
  • AWSIRSARegistrationDriver (163-163)
  • BundleVersionDefault (215-215)
  • BundleVersionLatest (212-212)
fleetconfig-controller/pkg/common/common.go (6)
  • ClusterClient (20-30)
  • AddOnClient (59-69)
  • GetManagedCluster (72-81)
  • UpdateManagedCluster (84-89)
  • WorkClient (46-56)
  • OperatorClient (33-43)
fleetconfig-controller/internal/kube/kube.go (3)
  • KubeconfigFromSecretOrCluster (113-119)
  • RestConfigFromKubeconfig (22-32)
  • RawFromInClusterRestConfig (75-81)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (2)
  • PrepareResources (32-55)
  • PrepareKubeconfig (15-29)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/version/version.go (2)
  • LowestBundleVersion (15-46)
  • Normalize (49-55)
🪛 Checkov (3.2.334)
fleetconfig-controller/config/devspace/spoke/manager.yaml

[medium] 1-58: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 1-58: Minimize the admission of root containers

(CKV_K8S_23)

🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml

[error] 2-2: syntax error: expected the node content, but found '-'

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
🔇 Additional comments (3)
fleetconfig-controller/internal/watch/watch.go (3)

1-10: LGTM!

Package declaration and imports are appropriate for a periodic resource watcher implementing manager.Runnable.


12-17: LGTM!

Function type signatures follow idiomatic controller-runtime patterns with proper context propagation.


19-27: LGTM!

Struct encapsulation is correct with all fields unexported, preventing external mutation.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)

268-299: CLUSTER_NAME validation is missing in manager setup.

Line 269 assumes CLUSTER_NAME is validated during manager setup, but as previously identified, manager.go only validates CLUSTER_NAMESPACE (SpokeNamespaceEnvVar) and does not check CLUSTER_NAME (SpokeNameEnvVar). Without this validation, the controller can start with an empty spokeName, causing the predicate filter to silently match nothing and leaving the controller inert.

Ensure manager.go validates that os.Getenv(v1beta1.SpokeNameEnvVar) is non-empty before calling SetupWithManagerForSpoke.

🧹 Nitpick comments (2)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)

228-265: Consider whether Hub status changes should trigger Spoke reconciliation.

The Hub watch predicate now triggers reconciliation when Hub.Status changes (line 256). Hub status updates typically don't affect Spoke configuration, and watching status can cause unnecessary reconciles if the status changes frequently.

Consider removing the status comparison if Hub status changes don't require Spoke updates:

-					return sharedFieldsChanged(oldHub.Spec.DeepCopy(), newHub.Spec.DeepCopy()) ||
-						!reflect.DeepEqual(oldHub.Status, newHub.Status)
+					return sharedFieldsChanged(oldHub.Spec.DeepCopy(), newHub.Spec.DeepCopy())

If Hub status changes do affect Spoke behavior, please add a comment explaining the dependency.

fleetconfig-controller/internal/watch/watch.go (1)

87-109: Consider running the first check immediately on start

Right now the watcher waits a full interval before the first evaluation. That can delay handlers noticeably when the condition is already true at startup (common for readiness/cleanup flows). You could reuse the same panic guard by hoisting it into a small helper and invoking it once before entering the loop, e.g.:

@@
-	for {
-		select {
-		case <-ctx.Done():
+	run := func() {
+		defer func() {
+			if r := recover(); r != nil {
+				w.log.Error(fmt.Errorf("panic: %v", r), "Watch check panicked", "name", w.name)
+			}
+		}()
+		if err := w.check(ctx); err != nil {
+			w.log.Error(err, "Watch check failed", "name", w.name)
+		}
+	}
+
+	run()
+
+	for {
+		select {
+		case <-ctx.Done():
 			w.log.Info("Shutting down resource watcher", "name", w.name)
 			return nil
-		case <-ticker.C:
-			func() {
-				defer func() {
-					if r := recover(); r != nil {
-						w.log.Error(fmt.Errorf("panic: %v", r), "Watch check panicked", "name", w.name)
-					}
-				}()
-				if err := w.check(ctx); err != nil {
-					w.log.Error(err, "Watch check failed", "name", w.name)
-				}
-			}()
+		case <-ticker.C:
+			run()
 		}
 	}

Keeps the existing behaviour on ticks while removing the initial delay.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 65fac35 and 0a5c147.

📒 Files selected for processing (2)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (10 hunks)
  • fleetconfig-controller/internal/watch/watch.go (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
🧬 Code graph analysis (1)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (6)
fleetconfig-controller/api/v1beta1/constants.go (14)
  • InstanceTypeManager (83-83)
  • HubCleanupFinalizer (10-10)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • InstanceTypeUnified (89-89)
  • InstanceTypeAgent (86-86)
  • SupportedInstanceTypes (135-139)
  • PivotComplete (31-31)
  • KlusterletSynced (34-34)
  • SpokeNameEnvVar (98-98)
  • AgentCleanupWatcherName (131-131)
  • SpokeNamespaceEnvVar (101-101)
  • PurgeAgentNamespaceEnvVar (113-113)
  • ControllerNamespaceEnvVar (107-107)
fleetconfig-controller/api/v1beta1/common.go (2)
  • NewCondition (112-123)
  • Condition (126-129)
fleetconfig-controller/internal/watch/watch.go (3)
  • NewOrDie (78-84)
  • Config (35-43)
  • New (46-76)
fleetconfig-controller/cmd/manager/manager.go (1)
  • Options (33-45)
fleetconfig-controller/internal/kube/kube.go (2)
  • RawFromInClusterRestConfig (75-81)
  • RestConfigFromKubeconfig (22-32)
fleetconfig-controller/pkg/common/common.go (1)
  • WorkClient (46-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (6)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (5)

57-57: LGTM! InstanceType field appropriately added.

The InstanceType field enables instance-aware reconciliation logic, aligning with the new hub/spoke architecture. The field is properly utilized throughout the reconcile flow.


141-149: LGTM! Cleanup logic correctly gates on HubCleanupFinalizer.

The deletion flow properly checks for HubCleanupFinalizer before executing cleanup. Based on learnings, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer, so this check is sufficient regardless of cluster type.

Based on learnings.


170-172: Verify the semantic intent of KlusterletSynced wantStatus.

The KlusterletSynced condition is initialized with WantStatus: metav1.ConditionFalse, which is unusual. Typically, conditions should eventually reach a True state to indicate success. Please confirm:

  • Is KlusterletSynced intended to represent an error/failure state (where True means out-of-sync)?
  • Or should the WantStatus be metav1.ConditionTrue to indicate successful synchronization?

If the current behavior is correct, consider adding a comment explaining why WantStatus=False is the desired steady state.


301-312: LGTM! Spoke deletion condition logic is correct.

The spokeDeletedCondition function correctly returns true when the Spoke resource is not found, indicating successful deletion. Error handling appropriately ignores NotFound errors.


314-348: LGTM! Agent self-destruct logic uses safe defaults.

The agentSelfDestruct function implements appropriate cleanup with defensive error handling:

  • Falls back to orphaning the namespace (less destructive) if PURGE_AGENT_NAMESPACE parsing fails
  • Only deletes the agent namespace when explicitly enabled
  • Properly cleans up AppliedManifestWorks

Consider documenting the PURGE_AGENT_NAMESPACE environment variable and its default behavior in user-facing documentation to clarify the namespace cleanup behavior.

fleetconfig-controller/internal/watch/watch.go (1)

45-75: Validation guardrail looks solid

Good call returning an error when required Config fields are missing; this closes the earlier gap and makes New safer to consume.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
fleetconfig-controller/internal/watch/watch.go (1)

45-76: Consider validating or documenting the Name field.

The constructor validates all critical fields, but cfg.Name is not validated. While an empty name won't cause runtime errors, it could make logs less useful for distinguishing multiple watchers. Consider either validating that Name is non-empty or documenting that it's optional.

Apply this diff to add Name validation if it should be required:

 	if cfg.Interval <= 0 {
 		return nil, errors.New("watch.Config.Interval must be positive")
 	}
+	if cfg.Name == "" {
+		return nil, errors.New("watch.Config.Name must not be empty")
+	}
 	timeout := cfg.Timeout

Alternatively, document in the Config struct comment that Name is optional.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a5c147 and d28fd65.

📒 Files selected for processing (1)
  • fleetconfig-controller/internal/watch/watch.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (6)
fleetconfig-controller/internal/watch/watch.go (6)

1-12: LGTM!

Package declaration and imports are appropriate for a generic watcher implementation.


14-14: LGTM!

The default timeout of 10 seconds is reasonable for watch operations.


16-43: LGTM!

Type definitions are well-structured with appropriate encapsulation. Function types provide clear contracts for condition checking and handler execution.


78-85: LGTM!

The NewOrDie constructor correctly implements the panic-on-error pattern with helpful error context.


87-111: LGTM!

The Start method correctly implements the manager.Runnable interface with proper resource cleanup, context cancellation handling, and panic recovery. Logging errors and panics without stopping the watcher is appropriate for a resilient long-running component.


113-131: LGTM!

The check method properly manages timeout contexts for both condition evaluation and handler execution. Separate timeouts for each operation provide good isolation and resource control.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
fleetconfig-controller/internal/controller/v1beta1/addon.go (2)

410-413: ****

This duplicates a prior review comment. The MCA may not exist immediately after enable due to eventual consistency. Add retry logic with wait.PollUntilContextTimeout to handle NotFound errors gracefully.

Based on past review comments.


424-428: ****

This duplicates a prior review comment. The idempotency check accesses fields incorrectly. AddOnConfig embeds ConfigGroupResource and ConfigReferent, so you must access c.ConfigGroupResource.Group, c.ConfigGroupResource.Resource, c.ConfigReferent.Name, and c.ConfigReferent.Namespace instead of c.Group, c.Resource, c.Name, c.Namespace.

Based on past review comments.

🧹 Nitpick comments (2)
fleetconfig-controller/internal/controller/v1beta1/addon.go (2)

702-713: Strengthen validation when !shouldCleanAll.

When expectedWorks == 1 and !shouldCleanAll, the logic checks if the remaining ManifestWork is the FCC addon. However, if the label is missing (ok == false), the function continues polling indefinitely rather than failing fast. Consider returning an error if the remaining work is not the expected FCC addon.

Apply this diff to fail fast on unexpected remaining works:

 		if len(manifestWorks.Items) == expectedWorks {
 			if shouldCleanAll {
 				logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWorks", len(manifestWorks.Items))
 				return true, nil
 			}
 			mw := manifestWorks.Items[0]
 			val, ok := mw.Labels[addonv1alpha1.AddonLabelKey]
-			if ok && val == v1beta1.FCCAddOnName {
+			if !ok || val != v1beta1.FCCAddOnName {
+				return false, fmt.Errorf("unexpected remaining ManifestWork: expected %s, got label=%q (ok=%t)", v1beta1.FCCAddOnName, val, ok)
+			}
-				logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWork", mw.Name)
-				return true, nil
-			}
+			logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWork", mw.Name)
+			return true, nil
 		}

792-810: Consider using strategic merge patch for idempotency.

The current logic overwrites the entire RoleBinding on update, which could cause issues if other controllers modify the binding. Using a strategic merge patch would be more robust and idempotent.

Apply this diff to use patch instead of update:

 	err := r.Create(ctx, binding, &client.CreateOptions{})
 	if err != nil {
 		if !kerrs.IsAlreadyExists(err) {
 			logger.Error(err, "failed to create role binding for addon")
 			return err
 		}
-		curr := &rbacv1.RoleBinding{}
-		err = r.Get(ctx, types.NamespacedName{Namespace: binding.Namespace, Name: binding.Name}, curr)
-		if err != nil {
-			logger.Error(err, "failed to get role binding for addon")
-			return err
-		}
-		binding.SetResourceVersion(curr.ResourceVersion)
-		err = r.Update(ctx, binding)
-		if err != nil {
-			logger.Error(err, "failed to update role binding for addon")
-			return err
-		}
+		// Use strategic merge patch to update only the fields we care about
+		patch := client.MergeFrom(&rbacv1.RoleBinding{
+			ObjectMeta: metav1.ObjectMeta{
+				Name:      binding.Name,
+				Namespace: binding.Namespace,
+			},
+		})
+		if err = r.Patch(ctx, binding, patch); err != nil {
+			logger.Error(err, "failed to patch role binding for addon")
+			return err
+		}
 	}
 	return nil
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d28fd65 and 891565d.

📒 Files selected for processing (8)
  • fleetconfig-controller/internal/args/args.go (2 hunks)
  • fleetconfig-controller/internal/args/args_test.go (1 hunks)
  • fleetconfig-controller/internal/controller/v1alpha1/addon.go (6 hunks)
  • fleetconfig-controller/internal/controller/v1alpha1/hub.go (4 hunks)
  • fleetconfig-controller/internal/controller/v1alpha1/spoke.go (7 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/addon.go (15 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/hub_controller.go (7 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1 hunks)
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-09-26T04:15:34.475Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.475Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (7)
fleetconfig-controller/internal/controller/v1alpha1/hub.go (1)
fleetconfig-controller/internal/args/args.go (3)
  • PrepareResources (44-67)
  • PrepareKubeconfig (27-41)
  • SanitizeArgs (82-102)
fleetconfig-controller/internal/controller/v1beta1/hub_controller.go (3)
fleetconfig-controller/api/v1beta1/common.go (2)
  • NewCondition (112-123)
  • Kubeconfig (13-29)
fleetconfig-controller/api/v1beta1/constants.go (3)
  • HubUpgradeFailed (37-37)
  • BundleVersionDefault (215-215)
  • BundleVersionLatest (212-212)
fleetconfig-controller/internal/args/args.go (3)
  • SanitizeArgs (82-102)
  • PrepareResources (44-67)
  • PrepareKubeconfig (27-41)
fleetconfig-controller/internal/args/args_test.go (1)
fleetconfig-controller/internal/args/args.go (2)
  • Redacted (15-15)
  • SanitizeArgs (82-102)
fleetconfig-controller/internal/controller/v1beta1/addon.go (6)
fleetconfig-controller/api/v1beta1/constants.go (7)
  • SupportedHubAddons (205-208)
  • LabelAddOnManagedBy (154-154)
  • FCCAddOnName (116-116)
  • BundleVersionLatest (212-212)
  • AddonArgoCD (198-198)
  • ClusterRoleNameEnvVar (110-110)
  • DefaultFCCManagerRole (119-119)
fleetconfig-controller/internal/args/args.go (1)
  • SanitizeArgs (82-102)
fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
  • GroupName (10-10)
  • Resource (26-28)
fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
  • AddOnDeploymentConfigResource (30-30)
fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/types_managedclusteraddon.go (1)
  • AddonLabelKey (240-240)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (52-58)
fleetconfig-controller/internal/controller/v1alpha1/addon.go (1)
fleetconfig-controller/internal/args/args.go (1)
  • SanitizeArgs (82-102)
fleetconfig-controller/internal/controller/v1alpha1/spoke.go (2)
fleetconfig-controller/internal/args/args.go (4)
  • SanitizeArgs (82-102)
  • PrepareKubeconfig (27-41)
  • Redacted (15-15)
  • PrepareResources (44-67)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (52-58)
fleetconfig-controller/api/v1beta1/spoke_types.go (6)
  • Spoke (309-323)
  • KlusterletChartConfig (214-216)
  • HubRef (110-118)
  • CleanupConfig (91-107)
  • AddOn (269-281)
  • Klusterlet (141-197)
fleetconfig-controller/api/v1beta1/constants.go (23)
  • InstanceTypeManager (83-83)
  • PivotComplete (31-31)
  • InstanceTypeUnified (89-89)
  • InstanceTypeAgent (86-86)
  • SupportedInstanceTypes (135-139)
  • KlusterletSynced (34-34)
  • SpokeJoined (28-28)
  • LabelManagedClusterType (151-151)
  • ManagedClusterTypeHubAsSpoke (77-77)
  • FCCAddOnName (116-116)
  • ControllerNamespaceEnvVar (107-107)
  • HubNamespaceEnvVar (104-104)
  • SpokeNamespaceEnvVar (101-101)
  • PurgeAgentNamespaceEnvVar (113-113)
  • AddonsConfigured (22-22)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • HubCleanupFinalizer (10-10)
  • OCMSpokeNamespaces (142-146)
  • HubInitialized (19-19)
  • AWSIRSARegistrationDriver (163-163)
  • BundleVersionDefault (215-215)
  • BundleVersionLatest (212-212)
fleetconfig-controller/pkg/common/common.go (6)
  • ClusterClient (20-30)
  • AddOnClient (59-69)
  • GetManagedCluster (72-81)
  • UpdateManagedCluster (84-89)
  • WorkClient (46-56)
  • OperatorClient (33-43)
fleetconfig-controller/internal/kube/kube.go (3)
  • KubeconfigFromSecretOrCluster (113-119)
  • RestConfigFromKubeconfig (22-32)
  • RawFromInClusterRestConfig (75-81)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (4)
  • PrepareResources (44-67)
  • PrepareKubeconfig (27-41)
  • SanitizeArgs (82-102)
  • Redacted (15-15)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/version/version.go (2)
  • LowestBundleVersion (15-46)
  • Normalize (49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (17)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (17)

38-66: LGTM!

The cleanup function correctly handles different instance types and the previous context extraction issue has been resolved.


68-120: LGTM!

The handleSpoke function correctly orchestrates hub and spoke work based on instance type with appropriate error handling.


280-302: LGTM!

The createAgentNamespace function correctly handles namespace creation with proper error checking for AlreadyExists.


304-325: LGTM!

The deleteKubeconfigSecret function correctly handles secret deletion with appropriate guards and error handling.


327-364: LGTM!

The doSpokeWork function correctly handles spoke-side upgrade logic with proper nil guards for ClusterManager.


366-501: LGTM!

The doHubCleanup function implements a comprehensive cleanup sequence with proper error handling and finalizer management.


503-576: LGTM!

The doSpokeCleanup function correctly handles spoke-side cleanup with appropriate logic for different cleanup scenarios.


762-775: LGTM!

The getJoinedCondition function correctly retrieves the joined condition with appropriate nil checks.


777-838: LGTM!

The spokeNeedsUpgrade function implements comprehensive upgrade detection logic with proper handling of special bundleVersion values.


840-883: LGTM!

The upgradeSpoke function correctly handles klusterlet upgrades with proper argument sanitization and logging.


885-915: LGTM!

The unjoinSpoke function correctly handles spoke deregistration with proper argument sanitization.


917-952: LGTM!

The getToken function correctly retrieves the join token and properly redacts sensitive information in logs using arg_utils.Redacted.


954-975: LGTM!

The getHubMeta function correctly retrieves Hub metadata with appropriate kubeconfig loading based on instance type.


977-1045: LGTM!

The mergeKlusterletValues function implements comprehensive value merging logic with proper error handling and clear log messages for missing ConfigMaps or keys.


1047-1066: LGTM!

The prepareKlusterletValuesFile function correctly handles nil and empty values with appropriate cleanup function management.


158-164: We need to inspect Spoke.BaseArgs to confirm whether it passes the hub kubeconfig to clusteradm. I'll fetch its implementation.


588-730: SanitizeArgs correctly redacts the hub token
The sensitiveKeys list includes “hub-token”, so SanitizeArgs(joinArgs) will redact it. Logging the join command’s stdout is safe as it does not include sensitive data.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
fleetconfig-controller/internal/args/args.go (1)

103-128: Critical: Inline --flag=value secrets not sanitized.

The current implementation only redacts values when the flag and value are in separate slice elements (e.g., ["--hub-token", "secret"]). Inline forms like --hub-token=secret are NOT sanitized and will be logged in full, potentially exposing tokens and other secrets. This is a critical security gap that undermines the goal of this refactor.

A previous review comment on lines 81-127 correctly identified this issue and provided a fix. Please apply the suggested patch to detect and redact inline --key=value forms by:

  1. Checking if an argument contains =
  2. Splitting on the first = to extract the key
  3. Normalizing the key and checking if it's sensitive
  4. If sensitive, replacing the value after = with Redacted

Apply this diff to address the issue:

 func sanitizeSlice(words []string) []string {
 	if len(words) == 0 {
 		return []string{}
 	}
 
 	// Start with a copy of all args
 	out := make([]string, len(words))
 	copy(out, words)
 
 	// Iterate through and redact values following sensitive keys
 	for i := 0; i < len(words); i++ {
+		// Handle inline --key=value forms
+		if key, _, hasEq := strings.Cut(words[i], "="); hasEq && isSensitiveKey(key) {
+			out[i] = key + "=" + Redacted
+			continue
+		}
+		// Handle space-separated --key value forms
 		if isSensitiveKey(words[i]) {
 			// Check if there's a next element to redact
 			if i+1 < len(words) {
 				out[i+1] = Redacted
 				i++ // Skip the next element since we just redacted it
 			}
 		}
 	}
 	return out
 }

Additionally, add a test case covering inline sensitive arguments to prevent regression:

func TestSanitizeArgs_InlineForm(t *testing.T) {
	args := []string{"--hub-token=secret123", "--other-flag", "value"}
	sanitized := SanitizeArgs(args)
	expected := []string{"--hub-token=REDACTED", "--other-flag", "value"}
	if !reflect.DeepEqual(sanitized, expected) {
		t.Errorf("Expected %v, got %v", expected, sanitized)
	}
}
🧹 Nitpick comments (1)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1)

885-915: Consider removing commented code.

Line 909 contains a commented condition check for amwExistsError. If this check is no longer needed (because the cleanup logic in doHubCleanup now properly handles addon manifestwork cleanup before unjoin), remove the commented code to improve readability. If it might be needed in the future, add a TODO comment explaining why.

Apply this diff if the check is no longer needed:

 	stdout, stderr, err := exec_utils.CmdWithLogs(ctx, cmd, fmt.Sprintf("waiting for 'clusteradm unjoin' to complete for spoke %s...", spoke.GetName()))
 	out := append(stdout, stderr...)
-	if err != nil { //|| strings.Contains(string(out), amwExistsError) {
+	if err != nil {
 		return fmt.Errorf("failed to unjoin spoke cluster %s: %v, output: %s", spoke.GetName(), err, string(out))
 	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0a2e14 and 9714339.

📒 Files selected for processing (3)
  • fleetconfig-controller/internal/args/args.go (2 hunks)
  • fleetconfig-controller/internal/controller/v1alpha1/spoke.go (7 hunks)
  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.

Applied to files:

  • fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/controller/v1alpha1/spoke.go (1)
fleetconfig-controller/internal/args/args.go (4)
  • SanitizeArgs (82-84)
  • PrepareKubeconfig (27-41)
  • Redacted (16-16)
  • PrepareResources (44-67)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
  • SpokeReconciler (52-58)
fleetconfig-controller/api/v1beta1/spoke_types.go (6)
  • Spoke (309-323)
  • KlusterletChartConfig (214-216)
  • HubRef (110-118)
  • CleanupConfig (91-107)
  • AddOn (269-281)
  • Klusterlet (141-197)
fleetconfig-controller/api/v1beta1/constants.go (18)
  • InstanceTypeManager (83-83)
  • PivotComplete (31-31)
  • InstanceTypeUnified (89-89)
  • InstanceTypeAgent (86-86)
  • KlusterletSynced (34-34)
  • SpokeJoined (28-28)
  • LabelManagedClusterType (151-151)
  • ManagedClusterTypeHubAsSpoke (77-77)
  • FCCAddOnName (116-116)
  • ControllerNamespaceEnvVar (107-107)
  • HubCleanupPreflightFinalizer (7-7)
  • SpokeCleanupFinalizer (13-13)
  • HubCleanupFinalizer (10-10)
  • OCMSpokeNamespaces (142-146)
  • HubInitialized (19-19)
  • AWSIRSARegistrationDriver (163-163)
  • BundleVersionDefault (215-215)
  • BundleVersionLatest (212-212)
fleetconfig-controller/pkg/common/common.go (6)
  • ClusterClient (20-30)
  • AddOnClient (59-69)
  • GetManagedCluster (72-81)
  • UpdateManagedCluster (84-89)
  • WorkClient (46-56)
  • OperatorClient (33-43)
fleetconfig-controller/internal/kube/kube.go (3)
  • KubeconfigFromSecretOrCluster (113-119)
  • RestConfigFromKubeconfig (22-32)
  • RawFromInClusterRestConfig (75-81)
fleetconfig-controller/internal/hash/hash.go (1)
  • ComputeHash (11-22)
fleetconfig-controller/internal/args/args.go (3)
  • PrepareResources (44-67)
  • PrepareKubeconfig (27-41)
  • SanitizeArgs (82-84)
fleetconfig-controller/internal/exec/exec.go (1)
  • CmdWithLogs (18-54)
fleetconfig-controller/internal/version/version.go (2)
  • LowestBundleVersion (15-46)
  • Normalize (49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: e2e (fleetconfig-controller) / e2e
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
  • GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
  • GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (30)
fleetconfig-controller/internal/controller/v1alpha1/spoke.go (8)

28-28: LGTM!

The aliased import arg_utils is clear and consistently used throughout the file for sanitization and argument preparation utilities.


243-243: LGTM!

Sanitizing arguments before logging prevents accidental exposure of sensitive data in logs.


281-289: LGTM!

Using PrepareKubeconfig and SanitizeArgs ensures that sensitive kubeconfig data and token arguments are handled securely and not leaked in logs.


297-297: LGTM!

Using arg_utils.Redacted to mask the token output is a critical security improvement that prevents accidental token exposure in logs.


333-333: LGTM!

Centralizing resource argument preparation via PrepareResources improves maintainability and consistency across command invocations.


413-421: LGTM!

Using PrepareKubeconfig and SanitizeArgs ensures that kubeconfig data and join arguments (including tokens) are handled securely and not exposed in logs.


530-538: LGTM!

Using PrepareKubeconfig and SanitizeArgs ensures that kubeconfig data and upgrade arguments are handled securely and not exposed in logs.


590-598: LGTM!

Using PrepareKubeconfig and SanitizeArgs ensures that kubeconfig data and unjoin arguments are handled securely and not exposed in logs.

fleetconfig-controller/internal/args/args.go (4)

7-8: LGTM!

The addition of slices and strings imports is appropriate for the sanitization logic implemented in this file.


15-23: LGTM!

The Redacted constant and sensitiveKeys list provide a clear and maintainable foundation for sanitization logic.


81-84: LGTM!

The SanitizeArgs function provides a clean public API for sanitizing argument slices.


86-101: LGTM!

The SanitizeOutput function provides a clean public API for sanitizing command output by tokenizing, sanitizing, and rejoining text.

fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (18)

38-66: LGTM!

The cleanup routing logic correctly handles all three instance types (Manager, Unified, Agent) with appropriate parameters. The panic for unknown instance types is acceptable since this is guarded at manager initialization time.


68-120: LGTM!

The spoke handling orchestration correctly routes work based on instance type, properly manages klusterlet values merging, and sets appropriate conditions on errors. The hub-as-spoke special case is handled correctly in Manager mode.


227-231: Verify error handling intent for kubeconfig secret deletion.

The code returns an error from deleteKubeconfigSecret, which will cause the reconcile to fail and requeue. If this deletion is critical for the join flow to proceed (e.g., for security compliance in AddOn mode), this behavior is correct. However, if the secret deletion can fail without blocking the join, consider logging the error without returning it.

Based on the PR objectives stating that AddOn mode removes the kubeconfig secret for improved security, this appears to be a critical operation that should block reconciliation if it fails.

Can you confirm whether kubeconfig secret deletion failure should block reconciliation, or if it can be made non-blocking with a warning log?


280-302: LGTM!

The agent namespace creation correctly handles the case where the namespace already exists, and properly uses the environment variable for the namespace name.


304-325: LGTM!

The kubeconfig secret deletion logic correctly implements the security improvement described in the PR objectives, with appropriate guards to ensure it only deletes when in Manager mode with a successful pivot and when the secret should be purged.


327-364: LGTM!

The spoke work handling correctly implements upgrade checking with proper nil guards for ClusterManager, appropriate hash-based change detection, and proper condition management. The nil ClusterManager check at line 345 correctly addresses the past review concern.

Based on learnings


366-501: LGTM!

The hub cleanup logic correctly orchestrates the deregistration sequence with proper checks for manifestWork ownership, addon cleanup, finalizer coordination, and resource removal. The differentiation between hub-as-spoke/non-pivoted cleanup and normal AddOn cleanup is handled appropriately.


503-576: LGTM!

The spoke cleanup logic correctly coordinates with hub cleanup via finalizers, handles the different kubeconfig sources based on pivot status, and performs appropriate additional cleanup in Agent mode when the operator should be purged.


578-586: LGTM!

The helper types are well-defined and appropriately structured for holding token metadata and hub configuration.


732-760: Clarify whether acceptCluster should use explicit hub kubeconfig.

A past review comment suggested passing hub kubeconfig explicitly to acceptCluster to avoid relying on ambient context. However, this function is only called from doHubWork in Manager/Unified mode where the controller should have proper hub access via its default client configuration. The spoke.BaseArgs() method includes context information if needed.

If clusteradm accept uses the ambient kubeconfig (via KUBECONFIG env or ~/.kube/config) and the controller is running in-cluster on the hub, this should work correctly. However, if there are scenarios where the ambient context might be incorrect, explicit kubeconfig passing would be safer.

Can you clarify whether clusteradm accept needs explicit hub kubeconfig configuration, or if the ambient context is guaranteed to be correct in all deployment scenarios?


762-775: LGTM!

The condition retrieval logic is straightforward with appropriate nil guards.


777-838: LGTM!

The upgrade detection logic correctly handles hash-based change detection, special bundle version keywords, and actual version comparison using normalized semver. The approach of using the lowest bundle version from the klusterlet spec is appropriate for detecting if all components are at the desired version.


840-883: LGTM!

The upgrade logic correctly constructs the upgrade command with appropriate flags, handles temporary files properly with deferred cleanup, and sanitizes logs to avoid exposing sensitive information.


917-952: LGTM!

The token retrieval logic correctly addresses the past security concern by not logging the sensitive token payload. The use of arg_utils.Redacted at line 945 ensures that sensitive data is not exposed in logs. The nil check for ClusterManager is appropriate.


954-975: LGTM!

The hub metadata retrieval correctly handles instance-type specific behavior, only loading the hub kubeconfig when needed (not in Agent mode). The error handling appropriately ignores NotFound errors.


977-1045: LGTM!

The klusterlet values merging logic correctly handles all cases (nil values, missing ConfigMap/key, spec-only values, ConfigMap-only values, and merged values) with appropriate warnings and error handling. The corrected log message at line 1003 accurately describes the missing key scenario. The merge strategy appropriately prioritizes spec values over ConfigMap values.

Based on learnings


1047-1066: LGTM!

The temporary values file preparation correctly handles nil and empty values cases, properly manages temporary file creation and cleanup, and provides appropriate error messages.

Based on learnings


588-730: SanitizeArgs correctly redacts the --hub-token value

Confirmed that sensitiveKeys includes “hub-token”, so sanitizeSlice replaces the token argument with REDACTED.

Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice - much better :)

@openshift-ci
Copy link

openshift-ci bot commented Oct 3, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arturshadnik, TylerGillson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants