-
Notifications
You must be signed in to change notification settings - Fork 7
refactor: fleetconfig-controller as an addon #69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: fleetconfig-controller as an addon #69
Conversation
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
WalkthroughSplit runtime into Hub and Spoke modes; add Hub/Spoke APIs, Spoke CleanupConfig, new addon CRDs/manifests, manager factories, webhook/validation updates, hub‑spoke controllers and reconciliation flows, Helm/chart and devspace changes, tests, docs, and utility helpers (arg sanitization, ResourceWatcher). Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (4)📚 Learning: 2025-09-12T22:46:57.106ZApplied to files:
📚 Learning: 2025-09-12T22:46:57.106ZApplied to files:
📚 Learning: 2025-09-12T22:46:57.106ZApplied to files:
📚 Learning: 2025-09-12T22:46:57.106ZApplied to files:
🧬 Code graph analysis (1)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (9)
fleetconfig-controller/build/Dockerfile.gke (2)
88-88: Fix:apk del --no-cacheis invalid and will fail the build
apk deldoes not support--no-cache. This RUN will error on Alpine.Apply this diff:
-RUN apk update && apk del --no-cache bash curl +RUN apk update && apk del bash curl
67-78: Pin gcloud download URL to GCLOUD_VERSION and default OS
- Add
ARG OS=linuxand include${GCLOUD_VERSION}in the curl URL, e.g.:ARG OS=linux ARG GCLOUD_VERSION=525.0.0 … && curl -Lo google-cloud-sdk.tar.gz "https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-${GCLOUD_VERSION}-${OS}-${GCLOUD_ARCH}.tar.gz"- Verify the correct ARCH token for arm64 tarballs (
armvsarm64) for this GCLOUD_VERSIONfleetconfig-controller/charts/fleetconfig-controller/crds/0000_01_addon.open-cluster-management.io_managedclusteraddons.crd.yaml (1)
1-406: Remove duplicate addon/v1alpha1 CRDs from chart
hack/install_crds.sh syncs all addon/v1alpha1/*.crd.yaml into fleetconfig-controller/charts/fleetconfig-controller/crds; delete 0000_01_addon.open-cluster-management.io_managedclusteraddons.crd.yaml (and its addon/v1alpha1 siblings) to avoid duplicate Helm installs.fleetconfig-controller/README.md (1)
12-12: Fix spelling of “decentralized”.The new sentence has a typo; please swap in the correct spelling so the onboarding docs read cleanly.
- It is used to enable decentalized management of spoke clusters. + It is used to enable decentralized management of spoke clusters.fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)
412-431: Align HubRef update test with intended behavior.The case is now labeled “allowed”, but
wantErris stilltrue, so we continue to assert this change must be rejected. If HubRef updates are meant to be permitted, flip the expectation (and drop the error string); otherwise please revert the rename. Right now the test contradicts itself.- wantErr: true, - errMsg: "spoke contains changes which are not allowed; only changes to spec.klusterlet.annotations, spec.klusterlet.values, spec.kubeconfig, spec.addOns, spec.timeout, and spec.logVerbosity are allowed when updating a spoke", + wantErr: false,fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)
363-366: Guard against missing HubRef to avoid invalid client.Get callsIf HubRef is empty (common in hub-as-spoke), cli.Get with empty name/namespace will error. Return a warning early.
Apply this diff:
// try to get hub, if not present or not ready, log a warning that addons cant be properly validated hub := &v1beta1.Hub{} -err := cli.Get(ctx, types.NamespacedName{Name: newObject.Spec.HubRef.Name, Namespace: newObject.Spec.HubRef.Namespace}, hub) +if newObject.Spec.HubRef.Name == "" || newObject.Spec.HubRef.Namespace == "" { + return admission.Warnings{warnHubNotFound}, nil +} +err := cli.Get(ctx, types.NamespacedName{ + Name: newObject.Spec.HubRef.Name, + Namespace: newObject.Spec.HubRef.Namespace, +}, hub)fleetconfig-controller/internal/controller/v1beta1/addon.go (1)
674-701: Count only addon-owned ManifestWorks to avoid false positives and timeouts.Comparing total count to 0/1 can race with unrelated works. Filter by OwnerReferences.Kind == "ManagedClusterAddOn" when deciding readiness. This reduces flakiness during cleanup.
Apply this diff:
- // for hub-as-spoke, all addons must be removed. - // otherwise, fleetconfig-controller-manager must not be removed. - var expectedWorks = 1 - if isHubAsSpoke { - expectedWorks = 0 - } - - if len(manifestWorks.Items) == expectedWorks { + // for hub-as-spoke, all addon works must be removed. + // otherwise, fleetconfig-controller-manager addon work must remain. + var expectedWorks = 1 + if isHubAsSpoke { + expectedWorks = 0 + } + // count only ManifestWorks owned by ManagedClusterAddOn + owned := 0 + for _, mw := range manifestWorks.Items { + if slices.ContainsFunc(mw.OwnerReferences, func(or metav1.OwnerReference) bool { return or.Kind == managedClusterAddOn }) { + owned++ + } + } + if owned == expectedWorks { logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWorks", len(manifestWorks.Items)) return true, nil } - logger.V(3).Info("waiting for addon manifestWorks cleanup", - "spokeName", spokeName, - "addonManifestWorks", len(manifestWorks.Items)) + logger.V(3).Info("waiting for addon manifestWorks cleanup", + "spokeName", spokeName, + "addonManifestWorksOwnedByMCA", owned, + "allManifestWorks", len(manifestWorks.Items))Based on learnings
fleetconfig-controller/test/e2e/helper.go (2)
472-475: Bug: wrong error guard after Create — AlreadyExists should be ignored, not NotFound.Create returns AlreadyExists, so the current check turns a benign condition into a failure.
Apply:
- err = tc.kClient.Create(tc.ctx, cm) - if err != nil && !kerrs.IsNotFound(err) { + err = tc.kClient.Create(tc.ctx, cm) + if err != nil && !kerrs.IsAlreadyExists(err) { utils.WarnError(err, "failed to create configmap") return err }
362-363: Fix WarnError formatting args.An extra err argument is passed into the format string with only two %s verbs.
Use:
- utils.WarnError(err, "failed to fetch namespace %s from cluster '%s'", err, namespaceName, cluster) + utils.WarnError(err, "failed to fetch namespace %s from cluster '%s'", namespaceName, cluster)
🧹 Nitpick comments (19)
fleetconfig-controller/build/Dockerfile.eks (1)
41-49: Optional: add defaults for OS and verify arm64 artifact namingYou rely on
${OS}/${ARCH}for aws-iam-authenticator. Consider defaultingARG OS=linuxand confirm arm64 artifact name islinux_arm64for v0.7.2.fleetconfig-controller/hack/install_crds.sh (1)
29-29: LGTM: include OCM addon CRDs in syncPulling addon/v1alpha1 CRDs complements existing cluster CRDs. Consider surfacing a note in README about re-running this script when bumping OCM versions.
fleetconfig-controller/config/devspace/hub/manager.yaml (1)
19-26: Optional: add ROLE_NAME here too for parityIf the controller reads ROLE_NAME, mirror the Helm deployment by setting it here as well.
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1)
165-166: Remove trailing blank lineKeeps linters happy.
Apply this diff:
-fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)
90-92: Align allowed-fields error message with actual allowances (mention spec.kubeconfig)You already allow Kubeconfig changes but the error text omits it. Add for clarity.
Apply this diff:
- return errors.New("only changes to spec.apiServer, spec.clusterManager.source.*, spec.hubAddOns, spec.addOnConfigs, spec.logVerbosity, spec.timeout, and spec.registrationAuth are allowed when updating the hub") + return errors.New("only changes to spec.apiServer, spec.clusterManager.source.*, spec.hubAddOns, spec.addOnConfigs, spec.logVerbosity, spec.timeout, spec.registrationAuth, and spec.kubeconfig are allowed when updating the hub")fleetconfig-controller/config/devspace/spoke/manager.yaml (1)
18-56: Harden container securityContext for dev deploymentsSet runAsNonRoot and disable privilege escalation to address CKV_K8S_20/23. Safe for dev; adjust if base image requires root.
Apply this diff:
containers: - name: fleetconfig-controller-manager + securityContext: + allowPrivilegeEscalation: false + runAsNonRoot: true + capabilities: + drop: ["ALL"] env:fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
32-34: Clarify constant naming: value is a resource name, not a KindAddOnDeploymentConfigsKind holds the plural resource ("addondeploymentconfigs"), which can confuse readers expecting "AddOnDeploymentConfig". Consider a new, clearer constant and keep the old as a compat alias.
Apply this diff:
managedClusterAddOn = "ManagedClusterAddOn" - AddOnDeploymentConfigsKind = "addondeploymentconfigs" + // Deprecated: prefer AddOnDeploymentConfigResource + AddOnDeploymentConfigsKind = "addondeploymentconfigs" + // AddOnDeploymentConfigResource is the plural resource name used in GVRs + AddOnDeploymentConfigResource = "addondeploymentconfigs"fleetconfig-controller/devspace.yaml (1)
58-60: Make kubectl delete idempotent in debug pipelinesAdd --ignore-not-found to avoid pipeline failures when the deployment isn’t present.
Apply this diff:
debug: |- run_dependencies --all build_images fleetconfig-controller-dev - kubectl -n fleetconfig-system delete deployment fleetconfig-controller-manager + kubectl -n fleetconfig-system delete deployment fleetconfig-controller-manager --ignore-not-found create_deployments debug start_dev fleetconfig-controller-dev-hub debug-spoke: |- run_dependencies --all build_images fleetconfig-controller-dev - kubectl -n fleetconfig-system delete deployment fleetconfig-controller-manager + kubectl -n fleetconfig-system delete deployment fleetconfig-controller-manager --ignore-not-found create_deployments debug-spoke start_dev fleetconfig-controller-dev-spokeAlso applies to: 62-67
fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (3)
133-140: Harden the negative test by asserting the denial reason.The test only expects failure. Assert a substring from the webhook/validation error (e.g., contains "forbidden" or FeatureGates immutable) to avoid false positives from transient errors.
200-235: Optional: factor cleanup checks and use IgnoreNotFound for brevity.The repeated NotFound branches can be simplified with client.IgnoreNotFound and small helpers to reduce noise and ease future additions.
291-291: Remove noisy debug print.Avoid printing raw status to stdout in tests. Use utils.Info with context if needed, or remove entirely.
Apply this diff:
- fmt.Println(hubClone.Status) + // debug print removed to reduce noisy test outputfleetconfig-controller/charts/fleetconfig-controller/crds/0000_03_addon.open-cluster-management.io_addontemplates.crd.yaml (1)
82-83: Nit: fix typos in schema descriptions.Examples:
- "stratecy" → "strategy"
- "elligible"/"afer"/"manfiestwork" → "eligible"/"after"/"manifestwork"
- "resoure" → "resource"
- "kuberenetes" → "kubernetes"
Purely cosmetic, but helps consumers and generated docs.
Also applies to: 119-123, 289-291, 395-396
fleetconfig-controller/cmd/main.go (1)
100-101: Fix misleading log for unknown cluster type.The "controller=Spoke" field is incorrect when the type is unknown.
Apply this diff:
- setupLog.Info("unable to create controller for unknown cluster type", "controller", "Spoke", "clusterType", mOpts.ClusterType, "allowed", apiv1beta1.SupportedClusterTypes) + setupLog.Info("unable to create controller for unknown cluster type", "clusterType", mOpts.ClusterType, "allowed", apiv1beta1.SupportedClusterTypes)fleetconfig-controller/charts/fleetconfig-controller/crds/0000_00_addon.open-cluster-management.io_clustermanagementaddons.crd.yaml (1)
58-66: Nit: fix typos in schema descriptions.Examples:
- "filed" → "field"
- "instaces" → "instances"
Keeps generated docs clean.
Also applies to: 66-67, 120-121
fleetconfig-controller/api/v1beta1/constants.go (2)
6-8: Fix typo in comment.Use “that” instead of “than”.
-// HubCleanupPreflightFinalizer is the finalizer for cleanup preflight checks hub cluster's controller instance. Used to signal to the spoke's controller than unjoin can proceed. +// HubCleanupPreflightFinalizer is the finalizer for cleanup preflight checks by the hub cluster's controller instance. Used to signal to the spoke's controller that unjoin can proceed.
74-107: Correct env-var comments and addon name description.Minor doc fixes to avoid confusion.
-// HubNamespaceEnvVar is the environment variable containing the namespace of the Spoke resource. +// HubNamespaceEnvVar is the environment variable containing the namespace of the Hub resource. -// RoleNameEnvVar containing the name of the ClusterRole for fleetconfig-controller-manager. +// RoleNameEnvVar is the environment variable containing the name of the ClusterRole for fleetconfig-controller-manager. -// FCCAddOnName is the name of the fleetconfig-controller-addon +// FCCAddOnName is the name of the fleetconfig-controller manager add-onfleetconfig-controller/build/Dockerfile.devspace (1)
55-55: Broadened COPY is fine; add .dockerignore to keep image lean and caches stable.Copying the entire cmd/ dir increases build context and can invalidate the build cache frequently. Exclude tests, docs, and other non-build assets via .dockerignore to reduce image size and churn.
fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go (1)
135-136: Make condition assertion resilient to future conditions.assertSpokeConditions enforces exact length equality, making tests brittle as new conditions are added.
Consider ignoring length and only asserting expected types/statuses:
func assertSpokeConditions(conditions []v1beta1.Condition, expected map[string]metav1.ConditionStatus) error { seen := map[string]bool{} for _, c := range conditions { if want, ok := expected[c.Type]; ok { seen[c.Type] = true if c.Status != want { return fmt.Errorf("condition %s has status %s, expected %s", c.Type, c.Status, want) } } } for k := range expected { if !seen[k] { return fmt.Errorf("expected condition %s not found", k) } } return nil }fleetconfig-controller/api/v1beta1/spoke_types.go (1)
101-104: Clarify IsHubAsSpoke semantics; double‑check InCluster implication.Classifying hub-as-spoke purely by InCluster may mislabel cases where a spoke uses in-cluster credentials but is not the hub cluster. Please confirm the intended contract and document it.
If applicable, tighten or clarify the comment:
-// IsHubAsSpoke returns true if the cluster is a hub-as-spoke. Determined either by name `hub-as-spoke` or an InCluster kubeconfig +// IsHubAsSpoke returns true if this Spoke represents the hub cluster itself +// (hub-as-spoke). Determined either by name "hub-as-spoke" or when the Spoke +// is operated in-cluster by the same cluster as the controller process. +// Note: If a Spoke runs in-cluster on a non-hub controller, do not set InCluster. func (s *Spoke) IsHubAsSpoke() bool { return s.Name == ManagedClusterTypeHubAsSpoke || s.Spec.Kubeconfig.InCluster }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (34)
fleetconfig-controller/README.md(1 hunks)fleetconfig-controller/api/v1beta1/constants.go(3 hunks)fleetconfig-controller/api/v1beta1/spoke_types.go(1 hunks)fleetconfig-controller/build/Dockerfile.base(1 hunks)fleetconfig-controller/build/Dockerfile.devspace(1 hunks)fleetconfig-controller/build/Dockerfile.eks(1 hunks)fleetconfig-controller/build/Dockerfile.gke(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/crds/0000_00_addon.open-cluster-management.io_clustermanagementaddons.crd.yaml(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/crds/0000_01_addon.open-cluster-management.io_managedclusteraddons.crd.yaml(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/crds/0000_02_addon.open-cluster-management.io_addondeploymentconfigs.crd.yaml(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/crds/0000_03_addon.open-cluster-management.io_addontemplates.crd.yaml(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/clusterissuer.yaml(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml(2 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-management-addon.yaml(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-role-binding.yaml(1 hunks)fleetconfig-controller/cmd/main.go(2 hunks)fleetconfig-controller/cmd/manager/manager.go(1 hunks)fleetconfig-controller/config/devspace/hub/manager.yaml(1 hunks)fleetconfig-controller/config/devspace/spoke/manager.yaml(1 hunks)fleetconfig-controller/devspace-start-spoke.sh(1 hunks)fleetconfig-controller/devspace.yaml(7 hunks)fleetconfig-controller/hack/install_crds.sh(1 hunks)fleetconfig-controller/internal/controller/v1beta1/addon.go(8 hunks)fleetconfig-controller/internal/controller/v1beta1/common.go(0 hunks)fleetconfig-controller/internal/controller/v1beta1/constants.go(1 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go(8 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go(2 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(1 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation.go(2 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation_test.go(1 hunks)fleetconfig-controller/test/data/fleetconfig-values.yaml(1 hunks)fleetconfig-controller/test/e2e/helper.go(2 hunks)fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go(6 hunks)
💤 Files with no reviewable changes (1)
- fleetconfig-controller/internal/controller/v1beta1/common.go
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.
Applied to files:
fleetconfig-controller/test/e2e/helper.gofleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.gofleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/test/e2e/helper.gofleetconfig-controller/internal/webhook/v1beta1/validation_test.gofleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/devspace.yamlfleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.gofleetconfig-controller/api/v1beta1/constants.gofleetconfig-controller/test/e2e/v1beta1_hub_spoke.gofleetconfig-controller/README.mdfleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T18:42:03.404Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.
Applied to files:
fleetconfig-controller/config/devspace/spoke/manager.yamlfleetconfig-controller/test/data/fleetconfig-values.yamlfleetconfig-controller/devspace.yamlfleetconfig-controller/test/e2e/v1beta1_hub_spoke.gofleetconfig-controller/README.md
📚 Learning: 2025-09-22T19:26:11.020Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/test/data/fleetconfig-v1alpha1.yaml:47-53
Timestamp: 2025-09-22T19:26:11.020Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller tests, the kubeconfigKey is intentionally set to "value" in test fixtures (fleetconfig-v1alpha1.yaml, fleetconfig-values.yaml) because that's how the test harness provisions the kubeconfig secret during test setup. This differs from the chart default of "kubeconfig" but is correct for the test environment.
Applied to files:
fleetconfig-controller/devspace.yaml
📚 Learning: 2025-08-27T21:58:32.141Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.
Applied to files:
fleetconfig-controller/devspace.yamlfleetconfig-controller/config/devspace/hub/manager.yaml
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-08-22T17:55:52.159Z
Learnt from: TylerGillson
PR: open-cluster-management-io/lab#51
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-22T17:55:52.159Z
Learning: In the open-cluster-management-io/lab repository, chart versioning for fleetconfig-controller is handled automatically via GitHub release workflows, not through manual version bumps in Chart.yaml during regular PRs.
Applied to files:
fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml
🧬 Code graph analysis (9)
fleetconfig-controller/api/v1beta1/spoke_types.go (3)
fleetconfig-controller/api/v1alpha1/fleetconfig_types.go (2)
Spoke(350-392)Kubeconfig(301-317)fleetconfig-controller/api/v1beta1/constants.go (1)
ManagedClusterTypeHubAsSpoke(71-71)fleetconfig-controller/api/v1beta1/common.go (1)
Kubeconfig(13-29)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)
AddOn(241-253)
fleetconfig-controller/cmd/main.go (2)
fleetconfig-controller/cmd/manager/manager.go (3)
Options(34-46)ForHub(49-143)ForSpoke(146-264)fleetconfig-controller/api/v1beta1/constants.go (4)
SpokeDefaultMaxConcurrentReconciles(148-148)ClusterTypeHub(76-76)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go (1)
fleetconfig-controller/api/v1beta1/constants.go (2)
ClusterTypeHub(76-76)PivotComplete(31-31)
fleetconfig-controller/internal/controller/v1beta1/addon.go (4)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)
AddOn(241-253)fleetconfig-controller/api/v1beta1/constants.go (1)
FCCAddOnName(103-103)fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
GroupName(10-10)Resource(26-28)fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
AddOnDeploymentConfigsKind(33-33)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (2)
fleetconfig-controller/api/v1beta1/constants.go (8)
ClusterTypeHub(76-76)HubCleanupFinalizer(10-10)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)PivotComplete(31-31)SpokeNameEnvVar(88-88)fleetconfig-controller/cmd/manager/manager.go (1)
Options(34-46)
fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (2)
fleetconfig-controller/test/utils/utils.go (3)
DevspaceRunPipeline(78-97)Info(66-69)WarnError(72-75)fleetconfig-controller/api/v1beta1/constants.go (1)
Deleting(57-57)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(46-52)fleetconfig-controller/api/v1beta1/spoke_types.go (5)
Spoke(281-295)KlusterletChartConfig(186-188)HubRef(86-94)AddOn(241-253)Klusterlet(107-169)fleetconfig-controller/api/v1beta1/constants.go (19)
ClusterTypeHub(76-76)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)SpokeJoined(28-28)LabelManagedClusterType(118-118)ManagedClusterTypeHubAsSpoke(71-71)ControllerNamespaceEnvVar(97-97)HubNamespaceEnvVar(94-94)SpokeNamespaceEnvVar(91-91)AddonsConfigured(22-22)RoleNameEnvVar(100-100)DefaultFCCManagerRole(106-106)FCCAddOnName(103-103)PivotComplete(31-31)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)HubCleanupFinalizer(10-10)HubInitialized(19-19)AWSIRSARegistrationDriver(130-130)fleetconfig-controller/pkg/common/common.go (6)
ClusterClient(20-30)AddOnClient(59-69)GetManagedCluster(72-81)UpdateManagedCluster(84-89)WorkClient(46-56)OperatorClient(33-43)fleetconfig-controller/internal/kube/kube.go (3)
KubeconfigFromSecretOrCluster(113-119)RawFromInClusterRestConfig(75-81)RestConfigFromKubeconfig(22-32)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (2)
PrepareResources(32-55)PrepareKubeconfig(15-29)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)fleetconfig-controller/internal/version/version.go (2)
LowestBundleVersion(15-46)Normalize(49-55)
fleetconfig-controller/cmd/manager/manager.go (8)
fleetconfig-controller/internal/controller/v1alpha1/fleetconfig_controller.go (1)
FleetConfigReconciler(58-62)fleetconfig-controller/internal/controller/v1beta1/hub_controller.go (1)
HubReconciler(55-59)fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(46-52)fleetconfig-controller/api/v1beta1/constants.go (5)
ClusterTypeSpoke(79-79)SpokeNamespaceEnvVar(91-91)HubNamespaceEnvVar(94-94)HubKubeconfigEnvVar(82-82)DefaultHubKubeconfigPath(85-85)fleetconfig-controller/api/v1alpha1/fleetconfig_webhook.go (1)
SetupFleetConfigWebhookWithManager(38-43)fleetconfig-controller/internal/webhook/v1beta1/hub_webhook.go (1)
SetupHubWebhookWithManager(43-55)fleetconfig-controller/internal/webhook/v1beta1/spoke_webhook.go (1)
SetupSpokeWebhookWithManager(44-56)fleetconfig-controller/internal/kube/kube.go (1)
RestConfigFromKubeconfig(22-32)
🪛 Checkov (3.2.334)
fleetconfig-controller/config/devspace/spoke/manager.yaml
[medium] 1-56: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 1-56: Minimize the admission of root containers
(CKV_K8S_23)
🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
[warning] 165-165: too many blank lines (1 > 0)
(empty-lines)
[error] 19-19: syntax error: expected , but found ''
(syntax)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-role-binding.yaml
[error] 9-9: syntax error: expected , but found ''
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (32)
fleetconfig-controller/build/Dockerfile.gke (2)
80-81: Confirm gcloud’s bash dependency post-installYou remove bash from the final image. Some gcloud wrapper scripts are bash-based. Since you only need the gke-gcloud-auth-plugin, this may be fine, but please confirm no code invokes
gcloudat runtime.
23-30: LGTM on copying full cmd directoryBroader COPY context is appropriate given the main refactor; the build target remains cmd/main.go.
fleetconfig-controller/build/Dockerfile.eks (1)
24-31: LGTM: broadened COPY context is consistent with the refactorCopying the entire cmd/ dir aligns with multi-entrypoint patterns and keeps the build target unchanged.
fleetconfig-controller/charts/fleetconfig-controller/templates/clusterissuer.yaml (1)
2-2: LGTM: non-functional YAML language server hintThe added comment improves editor UX and doesn’t affect rendered manifests.
fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml (2)
47-53: Verify ROLE_NAME aligns with rendered RBAC object namesEnsure a Role/ClusterRole named
{{ include "chart.fullname" . }}-manager-roleexists; otherwise this env var is misleading or unused.
71-81: LGTM: gating TLS mounts/volumes on webhooksMounts and secret volumes are correctly included only when webhooks are enabled.
fleetconfig-controller/test/data/fleetconfig-values.yaml (1)
41-43: LGTM: add the fleetconfig-controller-manager addonMatches the new addon wiring introduced in the PR. Ensure AddOnTemplate/ClusterManagementAddOn resources use the same name for consistency.
fleetconfig-controller/config/devspace/hub/manager.yaml (1)
26-30: LGTM: add CONTROLLER_NAMESPACE for parity with chartKeeps devspace config aligned with the Helm deployment.
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-role-binding.yaml (1)
11-13: Verify addon-manager service account exists in the hub namespaceClusterRoleBinding targets ServiceAccount addon-manager-controller-sa in open-cluster-management-hub. Ensure it is created by OCM on the hub or provision it in this chart.
If needed, I can add a ServiceAccount manifest to this chart or reference the upstream one. Do you want me to draft it?
fleetconfig-controller/charts/fleetconfig-controller/crds/0000_02_addon.open-cluster-management.io_addondeploymentconfigs.crd.yaml (1)
1-280: LGTM: CRD schema looks consistent with OCM v1alpha1 AddOnDeploymentConfigFields, list-types, and patterns align with expectations.
fleetconfig-controller/devspace.yaml (1)
163-188: Nice split between hub and spoke dev targetsClear terminal commands and selective syncs reduce churn and speed up inner loop.
fleetconfig-controller/test/e2e/v1beta1_hub_spoke.go (2)
27-27: LGTM: apps/v1 import used correctly for addon agent checks.
64-69: LGTM: devspace pipelines sequence reads well (load to spoke, deploy on hub).fleetconfig-controller/api/v1beta1/constants.go (3)
12-13: LGTM: clarified Spoke finalizer comment.
27-31: LGTM: new PivotComplete condition.
109-114: LGTM: SupportedClusterTypes exported list.fleetconfig-controller/cmd/manager/manager.go (2)
61-65: LGTM: HTTP/2 hardening via TLS NextProtos.
198-229: Confirm cache scoping matches RBAC of the add-on.The ByObject cache scopes only Spoke (spokeNamespace), ConfigMap (spokeNamespace), and Hub (hubNamespace). Ensure the add-on’s permissions match these watches; otherwise reconciles will stall.
If helpful, I can scan the Helm templates/ClusterRole for matching list/watch verbs in those namespaces.
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (2)
94-107: LGTM: finalizer choreography per cluster type.Hub adds preflight + hub cleanup (and spoke cleanup for hub-as-spoke); Spoke adds spoke cleanup.
Also applies to: 109-116
151-153: LGTM: initialize PivotComplete with WantStatus=true.fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (8)
62-87: Nil klusterlet values are intentionally supported.Downstream paths handle nil (values file omitted; hash uses ZeroNil). No changes needed.
Based on learnings
182-211: Ensure controller namespace is set for AddOnDeploymentConfig.If CONTROLLER_NAMESPACE is empty, the agent install namespace becomes empty. Consider validating and surfacing a clear error.
I can add a guard here similar to the manager env var checks if you want.
316-344: LGTM: safety checks before unjoin (manifestWorks and add-ons).
387-399: LGTM: CSR cleanup filtered by cluster-name label and suffix trimming.
446-508: LGTM: spoke-side unjoin and self-cleanup, with NotFound-safe deletes.
949-955: Confirm mergo semantics match intent.With mergo.WithOverride, only non-empty values from specInterface override fromInterface, which aligns with “ignore zero-values from spec”.
If the intent is to also allow empty values to clear defaults, switch to WithOverwriteWithEmptyValue (if available in the used mergo version).
972-991: LGTM: values file emission is conditional and cleaned up.
855-857: Nil check on HubSpec.ClusterManager is valid
HubSpec.ClusterManager is defined as a pointer in v1beta1 ( *ClusterManager ); the nil check compiles and is required. No change needed.Likely an incorrect or invalid review comment.
fleetconfig-controller/build/Dockerfile.base (1)
25-25: LGTM: copying cmd/ aligns with the build target.No functional concerns; the build still targets cmd/main.go.
fleetconfig-controller/test/e2e/helper.go (2)
490-491: Confirm PivotComplete applies to all spokes.You require PivotComplete=True for every spoke. Verify non hub-as-spoke clusters also set this condition; otherwise gate this expectation per spoke type.
57-58: No change needed:v1beta1fccAddOnAgentNNis used infleetconfig-controller/test/e2e/v1beta1_hub_spoke.go(line 202).Likely an incorrect or invalid review comment.
fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go (1)
74-78: LGTM: initializes ClusterType=hub for the hub-as-spoke path.This matches the new reconciler field and test intent.
fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml
Show resolved
Hide resolved
...tconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
Outdated
Show resolved
Hide resolved
...tconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
Outdated
Show resolved
Hide resolved
...tconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
Show resolved
Hide resolved
...ntroller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-management-addon.yaml
Show resolved
Hide resolved
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml (1)
60-68: Keep theports:key when webhooks are disabledWhen
admissionWebhooks.enabledis false, the template renders- containerPort: …without a precedingports:key, producing invalid YAML. Always emit the list key and add the webhook port conditionally.- {{- if .Values.admissionWebhooks.enabled }} - ports: - - containerPort: {{ .Values.webhookService.port }} - name: webhook-server - protocol: TCP - {{- end }} + ports: + {{- if .Values.admissionWebhooks.enabled }} + - containerPort: {{ .Values.webhookService.port }} + name: webhook-server + protocol: TCP + {{- end }} - containerPort: {{ .Values.healthCheck.port }} name: healthz protocol: TCPfleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1)
164-165: Remove trailing blank line.YAMLlint warns about extra blank line at EOF.
Apply this diff:
- clusterRoleName: {{ include "chart.fullname" . }}-manager-role - + clusterRoleName: {{ printf "%s-manager-role" (include "chart.fullname" .) | quote }}As per static analysis hints
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml(2 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-management-addon.yaml(1 hunks)fleetconfig-controller/cmd/main.go(2 hunks)fleetconfig-controller/cmd/manager/manager.go(1 hunks)fleetconfig-controller/devspace-start-spoke.sh(1 hunks)fleetconfig-controller/internal/controller/v1beta1/addon.go(8 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go(8 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation.go(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/cluster-management-addon.yaml
- fleetconfig-controller/internal/controller/v1beta1/addon.go
- fleetconfig-controller/devspace-start-spoke.sh
- fleetconfig-controller/cmd/main.go
🧰 Additional context used
🧠 Learnings (6)
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)
AddOn(241-253)fleetconfig-controller/api/v1beta1/constants.go (1)
FCCAddOnName(103-103)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (2)
fleetconfig-controller/api/v1beta1/constants.go (8)
ClusterTypeHub(76-76)HubCleanupFinalizer(10-10)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)PivotComplete(31-31)SpokeNameEnvVar(88-88)fleetconfig-controller/cmd/manager/manager.go (1)
Options(33-45)
🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
[warning] 164-164: too many blank lines (1 > 0)
(empty-lines)
[error] 19-19: syntax error: expected , but found ''
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (6)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
249-257: Guard emptyCLUSTER_NAMEbefore scoping the controllerIf
CLUSTER_NAMEis unset, the predicate matches nothing and the controller never reconciles. Validate it up front so startup fails with a clear error instead of silently disabling the controller.-func (r *SpokeReconciler) SetupWithManagerForSpoke(mgr ctrl.Manager) error { - spokeName := os.Getenv(v1beta1.SpokeNameEnvVar) // we know this is set, because the mgr setup would have failed otherwise +func (r *SpokeReconciler) SetupWithManagerForSpoke(mgr ctrl.Manager) error { + spokeName := strings.TrimSpace(os.Getenv(v1beta1.SpokeNameEnvVar)) + if spokeName == "" { + return fmt.Errorf("env %s must be set to the Spoke name", v1beta1.SpokeNameEnvVar) + } return ctrl.NewControllerManagedBy(mgr). For(&v1beta1.Spoke{}, builder.WithPredicates(predicate.NewPredicateFuncs((Add
stringsto the imports.)fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (5)
84-88: Good fix: removed namespace from ClusterRole metadata.Cluster-scoped resource no longer sets metadata.namespace.
149-153: Good fix: removed invalid namespace from RoleBinding.roleRef.roleRef now contains only apiGroup, kind, name.
19-19: Quote Helm-composed names to fix YAML parse error (YAMLlint).Unquoted concatenations trigger a parse error at Line 19 and risk elsewhere. Quote via printf + quote.
Apply this diff:
- name: {{ include "chart.fullname" . }}-manager + name: {{ printf "%s-manager" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-manager-spoke-rolebinding + name: {{ printf "%s-manager-spoke-rolebinding" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-manager-spoke-role + name: {{ printf "%s-manager-spoke-role" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-leader-election-role + name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-leader-election-rolebinding + name: {{ printf "%s-leader-election-rolebinding" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-leader-election-role + name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }} @@ - clusterRoleName: {{ include "chart.fullname" . }}-manager-role + clusterRoleName: {{ printf "%s-manager-role" (include "chart.fullname" .) | quote }}As per static analysis hints
Also applies to: 116-116, 120-120, 128-128, 145-145, 152-152, 164-164
163-164: No missing ClusterRole: The templated ClusterRole{{ include "chart.fullname" . }}-manager-roleis defined in charts/fleetconfig-controller/templates/rbac/manager-rbac.yaml.
89-92: Confirm necessity of namespace delete permission. Namespace deletion is required by cleanup logic in internal/controller/v1alpha1/hub.go and internal/controller/v1beta1/spoke_handler.go and is gated by the CR’s purge flags—permission should remain.
...tconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
Show resolved
Hide resolved
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)
412-431: Fix test name vs expectation mismatch for HubRef change.The case is named “allowed - HubRef change” but asserts wantErr=true with a disallowed-fields message. Either:
- Rename to “disallowed - HubRef change”, or
- Allow HubRef changes in allowSpokeUpdate and set wantErr=false.
Given current code, rename is the minimal fix.
- name: "allowed - HubRef change", + name: "disallowed - HubRef change", ... - wantErr: true, - errMsg: "spoke contains changes which are not allowed; only changes to spec.klusterlet.annotations, spec.klusterlet.values, spec.kubeconfig, spec.addOns, spec.timeout, and spec.logVerbosity are allowed when updating a spoke", + wantErr: true, + errMsg: "spoke contains changes which are not allowed; only changes to spec.klusterlet.annotations, spec.klusterlet.values, spec.kubeconfig, spec.addOns, spec.timeout, and spec.logVerbosity are allowed when updating a spoke",
🧹 Nitpick comments (5)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)
362-376: Optionally short-circuit on existing hard errors to save API calls.If errs already contains Invalids from addon presence checks, you can return early instead of calling the API for hub lookup.
errs := field.ErrorList{} if newObject.IsHubAsSpoke() { ... } + if len(errs) > 0 { + return nil, errs + } + // try to get hub, if not present or not ready, log a warning that addons cant be properly validatedfleetconfig-controller/devspace.yaml (2)
163-188: Minor: consider numeric port vars for dev terminal mappings.DevSpace treats ports as numbers; defining HUB_PORT as an int avoids accidental quoting issues across tooling.
- HUB_PORT: - value: "2344" + HUB_PORT: + value: 2344Repeat similarly for SPOKE_PORT.
195-198: Same nit: SPOKE_PORT as number.See prior suggestion.
- SPOKE_PORT: - value: "2345" + SPOKE_PORT: + value: 2345fleetconfig-controller/internal/controller/v1beta1/addon.go (2)
387-396: Handle MCA not-yet-created and ensure idempotent patching.Right after enable, MCA may not exist; current code returns an error. Add a short poll for NotFound and avoid duplicating the config.
- if a.ConfigName == v1beta1.FCCAddOnName { - err = patchFCCMca(ctx, spoke.Name, addonC) - if err != nil { - enableErrs = append(enableErrs, err) - continue - } - } + if a.ConfigName == v1beta1.FCCAddOnName { + if err := patchFCCMca(ctx, spoke.Name, addonC); err != nil { + enableErrs = append(enableErrs, err) + continue + } + }And update patchFCCMca as below (see next comment).
676-699: Verify expected ManifestWorks count logic during cleanup.Switching to expectedWorks=1 for non hub-as-spoke assumes FCC manager MW remains during unjoin. Previous flow (per learnings) treated zero total MW as equivalent to zero addon MW in this context. Confirm this invariant change is intended; otherwise we may time out or under-clean.
Consider filtering to addon-owned ManifestWorks (e.g., by ownerRef kind ManagedClusterAddOn) or making expected count an argument derived from the caller’s intent.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
fleetconfig-controller/api/v1beta1/constants.go(3 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml(1 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml(1 hunks)fleetconfig-controller/cmd/main.go(2 hunks)fleetconfig-controller/cmd/manager/manager.go(1 hunks)fleetconfig-controller/config/devspace/hub/manager.yaml(1 hunks)fleetconfig-controller/devspace.yaml(7 hunks)fleetconfig-controller/internal/controller/v1beta1/addon.go(8 hunks)fleetconfig-controller/internal/controller/v1beta1/constants.go(1 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation.go(3 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation_test.go(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- fleetconfig-controller/charts/fleetconfig-controller/templates/deployment.yaml
- fleetconfig-controller/cmd/manager/manager.go
🧰 Additional context used
🧠 Learnings (8)
📚 Learning: 2025-08-27T21:58:32.141Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.
Applied to files:
fleetconfig-controller/config/devspace/hub/manager.yamlfleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/api/v1beta1/constants.gofleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/api/v1beta1/constants.gofleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/webhook/v1beta1/validation_test.go
📚 Learning: 2025-09-22T18:42:03.404Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.
Applied to files:
fleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-22T19:26:11.020Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/test/data/fleetconfig-v1alpha1.yaml:47-53
Timestamp: 2025-09-22T19:26:11.020Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller tests, the kubeconfigKey is intentionally set to "value" in test fixtures (fleetconfig-v1alpha1.yaml, fleetconfig-values.yaml) because that's how the test harness provisions the kubeconfig secret during test setup. This differs from the chart default of "kubeconfig" but is correct for the test environment.
Applied to files:
fleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-25T23:31:11.561Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml:110-112
Timestamp: 2025-09-25T23:31:11.561Z
Learning: The fleetconfig-controller-manager spoke agent requires create/update/patch/delete permissions on CustomResourceDefinitions because `clusteradm upgrade klusterlet` operations need create/update permissions and cleanup operations require delete permissions for proper lifecycle management.
Applied to files:
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
🧬 Code graph analysis (3)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (3)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
AddOn(241-253)HubRef(86-94)fleetconfig-controller/api/v1alpha1/fleetconfig_types.go (2)
AddOn(430-442)Hub(176-208)fleetconfig-controller/api/v1beta1/constants.go (2)
FCCAddOnName(103-103)HubInitialized(19-19)
fleetconfig-controller/cmd/main.go (2)
fleetconfig-controller/cmd/manager/manager.go (3)
Options(33-45)ForHub(73-146)ForSpoke(149-257)fleetconfig-controller/api/v1beta1/constants.go (4)
SpokeDefaultMaxConcurrentReconciles(148-148)ClusterTypeHub(76-76)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)
fleetconfig-controller/internal/controller/v1beta1/addon.go (3)
fleetconfig-controller/api/v1beta1/constants.go (1)
FCCAddOnName(103-103)fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
GroupName(10-10)Resource(26-28)fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
AddOnDeploymentConfigResource(33-33)
🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
[error] 19-19: syntax error: expected , but found ''
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (21)
fleetconfig-controller/config/devspace/hub/manager.yaml (1)
26-32: Env wiring looks good; matches new constants.CONTROLLER_NAMESPACE from fieldRef and static ROLE_NAME align with v1beta1 constants. No issues.
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (3)
221-223: Updated hub allowed-fields message is consistent with implementation.Includes spec.kubeconfig; matches validation.go.
236-238: Updated hub allowed-fields message is consistent with implementation.Includes spec.kubeconfig; matches validation.go.
255-257: Updated hub allowed-fields message is consistent with implementation.Includes spec.kubeconfig; matches validation.go.
fleetconfig-controller/internal/webhook/v1beta1/validation.go (4)
24-26: Good: removed local FCC addon name; using API constant.Avoids string drift.
89-91: Allowed-fields error updated to include kubeconfig.Matches tests.
348-360: Add-on presence guard based on IsHubAsSpoke is clear and correct.Checks both allow/deny cases using the shared constant.
378-382: Good: internal errors during hub/CMA listing are returned, not downgraded to warnings.Prevents invalid resources from slipping through.
fleetconfig-controller/devspace.yaml (7)
12-15: Ports vars added (HUB_PORT/SPOKE_PORT).Looks fine.
42-43: Start hub dev target in default dev pipeline.Good separation vs spoke.
58-61: Debug pipeline starts hub dev target.Consistent with manifest path switch to hub.
62-67: Spoke debug path added.Assumes hub exists; that’s fine for targeted spoke debugging.
Please confirm this pipeline is intended to run with a pre-existing hub (or via a separate terminal running the hub).
114-116: Wires fleetConfig.enabled flag for dev.Aligns with profiles; no issues.
137-139: Debug manifests switched to hub.Matches pipeline intent.
141-144: Spoke debug manifests path correct.No issues.
fleetconfig-controller/cmd/main.go (2)
66-69: Sane webhook defaults set (cert dir and port).Prevents ephemeral port issues; good.
86-102: Cluster-type switch delegating to ForHub/ForSpoke looks solid.Clear failure for unknown types and proper exit behavior.
fleetconfig-controller/api/v1beta1/constants.go (2)
74-107: New cluster-type and env constants look good.Names/readability align across code and charts.
109-114: SupportedClusterTypes matches usage in main.go.Good centralization.
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1)
19-19: Quote concatenated Helm names to fix YAML parse error and ensure correctness.Unquoted concatenations like
name: {{ include "chart.fullname" . }}-managercause YAMLlint syntax errors and can break templating. Quote them consistently (also avoids accidental number parsing).- name: {{ include "chart.fullname" . }}-manager + name: {{ printf "%s-manager" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-manager-spoke-rolebinding + name: {{ printf "%s-manager-spoke-rolebinding" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-manager-spoke-role + name: {{ printf "%s-manager-spoke-role" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-leader-election-role + name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-leader-election-rolebinding + name: {{ printf "%s-leader-election-rolebinding" (include "chart.fullname" .) | quote }} @@ - clusterRoleName: {{ include "chart.fullname" . }}-manager-role + clusterRoleName: {{ printf "%s-manager-role" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-leader-election-role + name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }}Also applies to: 116-116, 120-121, 128-129, 145-146, 156-157, 163-164
fleetconfig-controller/internal/controller/v1beta1/addon.go (1)
408-448: Bug: wrong field access in AddOnConfig comparisons; and no tolerance for MCA creation delay.Comparing c.Group/Resource/Name/Namespace is invalid; fields are nested. Also, returning on Get error (e.g., NotFound) races with addon enable.
Apply this diff to fix field access and add a brief poll for MCA readiness plus idempotent config check:
-func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error { - mca, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{}) - if err != nil { - return fmt.Errorf("failed to configure fleetconfig-controller-manager: %v", err) - } +func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error { + // wait briefly for MCA to appear after enable + var mca *addonv1alpha1.ManagedClusterAddOn + if err := wait.PollUntilContextTimeout(ctx, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) { + obj, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{}) + if err != nil { + if kerrs.IsNotFound(err) { + return false, nil + } + return false, err + } + mca = obj + return true, nil + }); err != nil { + return fmt.Errorf("failed to locate ManagedClusterAddOn %q in %q: %v", v1beta1.FCCAddOnName, spokeName, err) + } desired := addonv1alpha1.AddOnConfig{ ConfigGroupResource: addonv1alpha1.ConfigGroupResource{ Group: addonv1alpha1.GroupName, Resource: AddOnDeploymentConfigResource, }, ConfigReferent: addonv1alpha1.ConfigReferent{ Name: v1beta1.FCCAddOnName, Namespace: spokeName, }, } - if slices.ContainsFunc(mca.Spec.Configs, func(c addonv1alpha1.AddOnConfig) bool { - return c.Group == desired.Group && - c.Resource == desired.Resource && - c.Name == desired.Name && - c.Namespace == desired.Namespace - }) { + if slices.ContainsFunc(mca.Spec.Configs, func(c addonv1alpha1.AddOnConfig) bool { + return c.ConfigGroupResource.Group == desired.ConfigGroupResource.Group && + c.ConfigGroupResource.Resource == desired.ConfigGroupResource.Resource && + c.ConfigReferent.Name == desired.ConfigReferent.Name && + c.ConfigReferent.Namespace == desired.ConfigReferent.Namespace + }) { return nil } mca.Spec.Configs = append(mca.Spec.Configs, desired) patchBytes, err := json.Marshal(map[string]any{ "spec": map[string]any{"configs": mca.Spec.Configs}, }) if err != nil { return fmt.Errorf("failed to marshal patch for fleetconfig-controller-manager: %v", err) } if _, err = addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Patch( ctx, v1beta1.FCCAddOnName, types.MergePatchType, patchBytes, metav1.PatchOptions{}, ); err != nil { return fmt.Errorf("failed to patch fleetconfig-controller-manager: %v", err) } return nil }
fleetconfig-controller/internal/controller/v1beta1/constants.go
Outdated
Show resolved
Hide resolved
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (4)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
366-379: Return early if prior validation failed (avoid extra hub reads).If errs already contains hard failures, skip the hub lookup and return immediately.
if newObject.IsHubAsSpoke() { ... } else { ... } + if len(errs) > 0 { + return nil, errs + } + // try to get hub, if not present or not ready, log a warning that addons cant be properly validated
416-425: Typo: isAddondEnabled -> isAddonEnabled (optional).Rename for clarity; update call site accordingly.
fleetconfig-controller/api/v1beta1/constants.go (2)
99-101: Doc nit: clarify RoleNameEnvVar comment.Small grammar fix.
- // RoleNameEnvVar containing the name of the ClusterRole for fleetconfig-controller-manager. + // RoleNameEnvVar is the environment variable containing the name of the ClusterRole for fleetconfig-controller-manager.
155-161: Doc nits: fix typos in labels comments.Minor grammar/typo corrections.
- // ManagedByLabels are labeles applies to resources to denote that fleetconfig-controller is managing the lifecycle. + // ManagedByLabels are labels applied to resources to denote that fleetconfig-controller is managing the lifecycle. @@ - // ManagedBySelector is a label selector for filtering add-on resources managed fleetconfig-controller. + // ManagedBySelector is a label selector for filtering add-on resources managed by fleetconfig-controller.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
fleetconfig-controller/api/v1beta1/constants.go(4 hunks)fleetconfig-controller/internal/controller/v1beta1/addon.go(10 hunks)fleetconfig-controller/internal/controller/v1beta1/constants.go(1 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation.go(4 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation_test.go(4 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.go
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/api/v1beta1/constants.gofleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
🧬 Code graph analysis (3)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
SupportedHubAddons(168-171)FCCAddOnName(103-103)HubInitialized(19-19)fleetconfig-controller/api/v1beta1/spoke_types.go (2)
AddOn(241-253)HubRef(86-94)
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)
fleetconfig-controller/api/v1beta1/hub_types.go (4)
Hub(243-257)HubSpec(26-79)AddOnConfig(158-181)HubAddOn(184-198)
fleetconfig-controller/internal/controller/v1beta1/addon.go (3)
fleetconfig-controller/api/v1beta1/constants.go (4)
SupportedHubAddons(168-171)LabelAddOnManagedBy(121-121)FCCAddOnName(103-103)AddonArgoCD(164-164)fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
GroupName(10-10)Resource(26-28)fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
AddOnDeploymentConfigResource(30-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
🔇 Additional comments (7)
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)
222-257: LGTM: error message updated to include kubeconfig.The updated allowed-fields message for Hub updates matches the validation logic.
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)
152-158: LGTM: enforce SupportedHubAddons.Validates HubAddOn names against the centralized list.
fleetconfig-controller/internal/controller/v1beta1/addon.go (3)
387-396: LGTM: FCC MCA patch gated and aggregated.Patching only when FCC addon is enabled and aggregating errors is sound.
690-699: Avoid brittle equality on total ManifestWorks; detect addon-owned works instead.Comparing len(manifestWorks) to 0/1 risks timeouts if unrelated works exist or counts fluctuate. Detect presence of addon-owned ManifestWorks via OwnerReferences and complete when none remain.
- // for hub-as-spoke, all addons must be removed. - // otherwise, fleetconfig-controller-manager must not be removed. - var expectedWorks = 1 - if isHubAsSpoke { - expectedWorks = 0 - } - - if len(manifestWorks.Items) == expectedWorks { + // Complete when no ManifestWorks remain that are owned by ManagedClusterAddOn + hasAddonWorks := slices.ContainsFunc(manifestWorks.Items, func(mw workv1.ManifestWork) bool { + return slices.ContainsFunc(mw.OwnerReferences, func(or metav1.OwnerReference) bool { + return or.Kind == managedClusterAddOn + }) + }) + if !hasAddonWorks { logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWorks", len(manifestWorks.Items)) return true, nil }This matches the documented cleanup flow where non-addon works are handled earlier. Based on learnings
⛔ Skipped due to learnings
Learnt from: arturshadnik PR: open-cluster-management-io/lab#59 File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661 Timestamp: 2025-09-12T20:53:55.600Z Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.Learnt from: arturshadnik PR: open-cluster-management-io/lab#59 File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661 Timestamp: 2025-09-12T20:53:55.600Z Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.Learnt from: arturshadnik PR: open-cluster-management-io/lab#69 File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133 Timestamp: 2025-09-25T23:18:41.545Z Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
408-448: Fix field access and make patch idempotent with eventual-consistency wait.
- c.Group/Resource/Name/Namespace don’t exist on AddOnConfig; use nested fields.
- GET may race right after enable; poll for creation and avoid duplicate configs.
-func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error { - mca, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{}) - if err != nil { - return fmt.Errorf("failed to configure fleetconfig-controller-manager: %v", err) - } +func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error { + // Wait briefly for MCA to appear after enable + var mca *addonv1alpha1.ManagedClusterAddOn + if err := wait.PollUntilContextTimeout(ctx, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) { + obj, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{}) + if err != nil { + if kerrs.IsNotFound(err) { + return false, nil + } + return false, err + } + mca = obj + return true, nil + }); err != nil { + return fmt.Errorf("failed to locate ManagedClusterAddOn %q in %q: %v", v1beta1.FCCAddOnName, spokeName, err) + } desired := addonv1alpha1.AddOnConfig{ ConfigGroupResource: addonv1alpha1.ConfigGroupResource{ Group: addonv1alpha1.GroupName, Resource: AddOnDeploymentConfigResource, }, ConfigReferent: addonv1alpha1.ConfigReferent{ Name: v1beta1.FCCAddOnName, Namespace: spokeName, }, } - if slices.ContainsFunc(mca.Spec.Configs, func(c addonv1alpha1.AddOnConfig) bool { - return c.Group == desired.Group && - c.Resource == desired.Resource && - c.Name == desired.Name && - c.Namespace == desired.Namespace - }) { + if slices.ContainsFunc(mca.Spec.Configs, func(c addonv1alpha1.AddOnConfig) bool { + return c.ConfigGroupResource.Group == desired.ConfigGroupResource.Group && + c.ConfigGroupResource.Resource == desired.ConfigGroupResource.Resource && + c.ConfigReferent.Name == desired.ConfigReferent.Name && + c.ConfigReferent.Namespace == desired.ConfigReferent.Namespace + }) { return nil } mca.Spec.Configs = append(mca.Spec.Configs, desired) patchBytes, err := json.Marshal(map[string]any{ "spec": map[string]any{"configs": mca.Spec.Configs}, }) if err != nil { return fmt.Errorf("failed to marshal patch for fleetconfig-controller-manager: %v", err) } if _, err = addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Patch( ctx, v1beta1.FCCAddOnName, types.MergePatchType, patchBytes, metav1.PatchOptions{}, ); err != nil { return fmt.Errorf("failed to patch fleetconfig-controller-manager: %v", err) } return nil }This aligns with prior feedback to make the patch resilient and idempotent. Based on learnings
fleetconfig-controller/api/v1beta1/constants.go (1)
93-95: LGTM: fixed HubNamespaceEnvVar comment.Comment now correctly references the Hub resource.
fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
35-36: Label constant encodes key=value; split into key and value.Encoding the selector as "key=value" will produce an invalid label key when used in ObjectMeta.Labels. Define separate key and value constants.
- fccAddOnManifestWorkLabel = "open-cluster-management.io/addon-name=fleetconfig-controller-manager" + fccAddOnManifestWorkLabelKey = "open-cluster-management.io/addon-name" + fccAddOnManifestWorkLabelValue = "fleetconfig-controller-manager"Adjust call sites to set labels as: map[string]string{fccAddOnManifestWorkLabelKey: fccAddOnManifestWorkLabelValue}.
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
fleetconfig-controller/api/v1beta1/constants.go (2)
141-143: Doc fix: correct the URL key comment.The comment should reference AddonConfigMapManifestURLKey, not AddonConfigMapManifestRawKey.
Apply this diff:
- // AddonConfigMapManifestRawKey is the data key containing a URL to download manifests. + // AddonConfigMapManifestURLKey is the data key containing a URL to download manifests.
155-161: Typo in docstring for ManagedByLabels.“labeles applies” → “labels applied”.
Apply this diff:
- // ManagedByLabels are labeles applies to resources to denote that fleetconfig-controller is managing the lifecycle. + // ManagedByLabels are labels applied to resources to denote that fleetconfig-controller is managing the lifecycle.fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)
347-351: Fix potential nil-pointer deref and typo (isAddondEnabled → isAddonEnabled).cr.DesiredConfig can be nil; dereferencing crashes the webhook. Also fix the function name typo.
Apply this diff:
- if isAddondEnabled(mcAddOns.Items, removedAddon) { + if isAddonEnabled(mcAddOns.Items, removedAddon) { inUseAddons = append(inUseAddons, removedAddon) } @@ -// isAddonConfigInUse checks if a removed addon config is still referenced by any ManagedClusterAddOn. -func isAddondEnabled(mcAddOns []addonv1alpha1.ManagedClusterAddOn, removedAddon string) bool { +// isAddonEnabled checks if a removed addon config is still referenced by any ManagedClusterAddOn. +func isAddonEnabled(mcAddOns []addonv1alpha1.ManagedClusterAddOn, removedAddon string) bool { for _, mcao := range mcAddOns { for _, cr := range mcao.Status.ConfigReferences { - if cr.DesiredConfig.Name == removedAddon { + if cr.DesiredConfig == nil { + continue + } + if cr.DesiredConfig.Name == removedAddon { return true } } } return false }Also applies to: 415-424
🧹 Nitpick comments (5)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)
379-387: Optional: return early if add-on list is already invalid.Short-circuit on existing errs to skip hub lookup and reduce noisy warnings/work.
Apply this diff:
- // try to get hub, if not present or not ready, log a warning that addons cant be properly validated + // Bail out early if we already have hard validation failures. + if len(errs) > 0 { + return nil, errs + } + // Try to get hub; if not present or not ready, warn that add-ons can't be validated. hub := &v1beta1.Hub{}fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (4)
184-188: Use the central FCC add-on name constant to avoid drift.Prefer v1beta1.FCCAddOnName over a string literal.
Apply this diff:
- Name: "fleetconfig-controller-manager", + Name: v1beta1.FCCAddOnName,
353-356: Use the central FCC add-on name constant during cleanup.Avoid hardcoding the add-on name.
Apply this diff:
- if !spoke.IsHubAsSpoke() { - spokeCopy.Spec.AddOns = append(spokeCopy.Spec.AddOns, v1beta1.AddOn{ConfigName: "fleetconfig-controller-manager"}) // disable all except fcc - } + if !spoke.IsHubAsSpoke() { + // disable all except FCC + spokeCopy.Spec.AddOns = append(spokeCopy.Spec.AddOns, v1beta1.AddOn{ConfigName: v1beta1.FCCAddOnName}) + }
28-29: Unify registration driver constant to v1beta1 and drop the v1alpha1 import.Keeps constants in one API surface.
Apply this diff:
@@ - "github.com/open-cluster-management-io/lab/fleetconfig-controller/api/v1alpha1" "github.com/open-cluster-management-io/lab/fleetconfig-controller/api/v1beta1" @@ - if ra.Driver == v1alpha1.AWSIRSARegistrationDriver { + if ra.Driver == v1beta1.AWSIRSARegistrationDriver { raArgs := []string{ fmt.Sprintf("--registration-auth=%s", ra.Driver), }And remove the now-unused v1alpha1 import.
Also applies to: 585-597
952-955: Merge semantics vs comment: zero-values from spec are not ignored.mergo.Map with WithOverride will overwrite keys even with zero-values in specInterface. Either update the comment to reflect current behavior, or change the merge to skip zero-values from spec.
Apply this doc tweak if keeping current behavior:
- // Merge spec on top but ignore zero-values from spec + // Merge spec values on top (overriding overlapping keys). Note: zero-values in spec also override.Alternatively, prune zero-like values from specInterface before merging; I can provide a small helper to do that if desired.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
fleetconfig-controller/api/v1beta1/constants.go(4 hunks)fleetconfig-controller/internal/controller/v1beta1/constants.go(1 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(1 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation.go(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- fleetconfig-controller/internal/controller/v1beta1/constants.go
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/api/v1beta1/constants.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
SupportedHubAddons(172-175)FCCAddOnName(103-103)HubInitialized(19-19)fleetconfig-controller/api/v1beta1/spoke_types.go (2)
AddOn(241-253)HubRef(86-94)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(46-52)fleetconfig-controller/api/v1beta1/spoke_types.go (5)
Spoke(281-295)KlusterletChartConfig(186-188)HubRef(86-94)AddOn(241-253)Klusterlet(107-169)fleetconfig-controller/api/v1beta1/constants.go (19)
ClusterTypeHub(76-76)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)SpokeJoined(28-28)LabelManagedClusterType(118-118)ManagedClusterTypeHubAsSpoke(71-71)ControllerNamespaceEnvVar(97-97)HubNamespaceEnvVar(94-94)SpokeNamespaceEnvVar(91-91)AddonsConfigured(22-22)RoleNameEnvVar(100-100)DefaultFCCManagerRole(106-106)FCCAddOnName(103-103)PivotComplete(31-31)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)HubCleanupFinalizer(10-10)HubInitialized(19-19)AWSIRSARegistrationDriver(130-130)fleetconfig-controller/pkg/common/common.go (6)
ClusterClient(20-30)AddOnClient(59-69)GetManagedCluster(72-81)UpdateManagedCluster(84-89)WorkClient(46-56)OperatorClient(33-43)fleetconfig-controller/internal/kube/kube.go (3)
KubeconfigFromSecretOrCluster(113-119)RawFromInClusterRestConfig(75-81)RestConfigFromKubeconfig(22-32)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (2)
PrepareResources(32-55)PrepareKubeconfig(15-29)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)fleetconfig-controller/internal/version/version.go (2)
LowestBundleVersion(15-46)Normalize(49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (1)
fleetconfig-controller/api/v1beta1/constants.go (1)
151-153: Allowing http for add-on manifests may leak integrity.Do we intentionally allow plain http here? Consider restricting to https only or gating http behind a dev/debug flag.
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
Outdated
Show resolved
Hide resolved
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
Outdated
Show resolved
Hide resolved
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
fleetconfig-controller/cmd/manager/manager.go (3)
166-177: Trim whitespace when validating required env vars.Rejecting only empty string allows whitespace-only values. Trim to avoid accidental misconfig.
Apply this diff:
- spokeNamespace := os.Getenv(apiv1beta1.SpokeNamespaceEnvVar) - if spokeNamespace == "" { + spokeNamespace := strings.TrimSpace(os.Getenv(apiv1beta1.SpokeNamespaceEnvVar)) + if spokeNamespace == "" { err = fmt.Errorf("CLUSTER_NAMESPACE environment variable must be set") setupLog.Error(err, "unable to create controller", "controller", "Spoke") return nil, err } - hubNamespace := os.Getenv(apiv1beta1.HubNamespaceEnvVar) - if hubNamespace == "" { + hubNamespace := strings.TrimSpace(os.Getenv(apiv1beta1.HubNamespaceEnvVar)) + if hubNamespace == "" { err = fmt.Errorf("HUB_NAMESPACE environment variable must be set") setupLog.Error(err, "unable to create controller", "controller", "Spoke") return nil, err }And add:
+ "strings"to imports.
150-151: Avoid constructing an unused webhook server in ForSpoke.You’re only using TLSOpts for metrics; creating a webhook.Server is unnecessary work.
Apply this diff:
- _, tlsOpts := setupServer(opts, setupLog) + tlsOpts := []func(*tls.Config){} + if !opts.EnableHTTP2 { + tlsOpts = append(tlsOpts, func(c *tls.Config) { setupLog.Info("disabling http/2"); c.NextProtos = []string{"http/1.1"} }) + }
274-277: Prefer parsing kubeconfig over raw string replacement.Byte-wise ReplaceAll may unintentionally rewrite unrelated content. Parse and update client-certificate/client-key fields for robustness.
Apply this diff:
- // patch the kubeconfig with absolute paths to cert/key - hubKubeconfig = bytes.ReplaceAll(hubKubeconfig, []byte("tls.crt"), []byte(certPath)) - hubKubeconfig = bytes.ReplaceAll(hubKubeconfig, []byte("tls.key"), []byte(keyPath)) + // patch kubeconfig with absolute cert/key paths + type kubeCfg struct { + Users []struct { + Name string `json:"name" yaml:"name"` + User struct { + ClientCertificate string `json:"client-certificate,omitempty" yaml:"client-certificate,omitempty"` + ClientKey string `json:"client-key,omitempty" yaml:"client-key,omitempty"` + } `json:"user" yaml:"user"` + } `json:"users" yaml:"users"` + } + var kc kubeCfg + if err := yaml.Unmarshal(hubKubeconfig, &kc); err == nil { + for i := range kc.Users { + if kc.Users[i].User.ClientCertificate != "" { + kc.Users[i].User.ClientCertificate = certPath + } + if kc.Users[i].User.ClientKey != "" { + kc.Users[i].User.ClientKey = keyPath + } + } + if patched, err := yaml.Marshal(kc); err == nil { + hubKubeconfig = patched + } + }Note: CA patching isn’t needed since OCM addon kubeconfigs embed certificate-authority-data.
Based on learningsfleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (2)
720-724: Use a constant for the ManagedClusterJoined condition.Avoid hardcoding the string; prefer the upstream constant if available to prevent typos.
Example:
if c.Type == clusterv1.ManagedClusterConditionJoined { ... }
972-979: Merge semantics comment vs. code clarity.The comment says “ignore zero-values from spec” which matches mergo’s default behavior (even with WithOverride, zero values are still ignored). Consider clarifying this to avoid confusion.
Proposed wording: “Merge spec on top; non-empty spec values override values from ConfigMap.”
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
fleetconfig-controller/api/v1beta1/spoke_types.go(2 hunks)fleetconfig-controller/cmd/manager/manager.go(1 hunks)fleetconfig-controller/internal/controller/v1beta1/addon.go(10 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(1 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation.go(4 hunks)
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
📚 Learning: 2025-09-26T04:15:34.443Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.443Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.go
📚 Learning: 2025-09-26T04:13:12.098Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:719-724
Timestamp: 2025-09-26T04:13:12.098Z
Learning: The TestValidateAddonUniqueness function in fleetconfig-controller/internal/webhook/v1beta1/validation_test.go has flawed assertion logic that uses `len(err.Error()) >= len(expectedMsg)` to check error messages, which doesn't actually validate message content and will pass for any error longer than the expected message.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/api/v1beta1/spoke_types.go
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/internal/controller/v1beta1/addon.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:26:18.303Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/cmd/manager/manager.go:290-292
Timestamp: 2025-09-25T23:26:18.303Z
Learning: ManagedClusterAddOn hub kubeconfigs in Open Cluster Management use certificate-authority-data (base64-encoded inline CA certificate) rather than certificate-authority (file path reference). This makes the kubeconfig self-contained and eliminates the need to patch CA file paths.
Applied to files:
fleetconfig-controller/cmd/manager/manager.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (4)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
SupportedHubAddons(172-175)FCCAddOnName(103-103)HubInitialized(19-19)fleetconfig-controller/api/v1beta1/spoke_types.go (2)
AddOn(255-267)HubRef(87-95)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
fleetconfig-controller/api/v1beta1/constants.go (4)
ManagedClusterTypeHubAsSpoke(71-71)PivotComplete(31-31)SpokeJoined(28-28)SpokeCleanupFinalizer(13-13)fleetconfig-controller/api/v1beta1/common.go (1)
Kubeconfig(13-29)
fleetconfig-controller/internal/controller/v1beta1/addon.go (3)
fleetconfig-controller/api/v1beta1/constants.go (4)
SupportedHubAddons(172-175)LabelAddOnManagedBy(121-121)FCCAddOnName(103-103)AddonArgoCD(165-165)fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
GroupName(10-10)Resource(26-28)fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
AddOnDeploymentConfigResource(30-30)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(46-52)fleetconfig-controller/api/v1beta1/spoke_types.go (5)
Spoke(295-309)KlusterletChartConfig(200-202)HubRef(87-95)AddOn(255-267)Klusterlet(121-183)fleetconfig-controller/api/v1beta1/constants.go (19)
ClusterTypeHub(76-76)PivotComplete(31-31)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)SpokeJoined(28-28)LabelManagedClusterType(118-118)ManagedClusterTypeHubAsSpoke(71-71)ControllerNamespaceEnvVar(97-97)HubNamespaceEnvVar(94-94)SpokeNamespaceEnvVar(91-91)AddonsConfigured(22-22)RoleNameEnvVar(100-100)DefaultFCCManagerRole(106-106)FCCAddOnName(103-103)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)HubCleanupFinalizer(10-10)HubInitialized(19-19)AWSIRSARegistrationDriver(130-130)fleetconfig-controller/pkg/common/common.go (6)
ClusterClient(20-30)AddOnClient(59-69)GetManagedCluster(72-81)UpdateManagedCluster(84-89)WorkClient(46-56)OperatorClient(33-43)fleetconfig-controller/internal/kube/kube.go (3)
KubeconfigFromSecretOrCluster(113-119)RawFromInClusterRestConfig(75-81)RestConfigFromKubeconfig(22-32)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (2)
PrepareResources(32-55)PrepareKubeconfig(15-29)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)fleetconfig-controller/internal/version/version.go (2)
LowestBundleVersion(15-46)Normalize(49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (10)
fleetconfig-controller/internal/controller/v1beta1/addon.go (1)
408-446: Reintroduce eventual-consistency handling when patching the MCARight after
clusteradm addon enable, theManagedClusterAddOnis often absent for a few seconds. The currentGetcall treats that as a hard failure, so enabling FCC on a busy or fresh spoke will routinely break again—the same race we fixed earlier. Please restore the short poll / NotFound tolerance so we wait for the MCA to appear before patching.Apply this diff to fix it:
func patchFCCMca(ctx context.Context, spokeName string, addonC *addonapi.Clientset) error { - mca, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{}) - if err != nil { - return fmt.Errorf("failed to configure fleetconfig-controller-manager: %v", err) - } - desired := addonv1alpha1.AddOnConfig{ + // Wait briefly for the MCA to show up after enable + var mca *addonv1alpha1.ManagedClusterAddOn + if err := wait.PollUntilContextTimeout(ctx, time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) { + obj, err := addonC.AddonV1alpha1().ManagedClusterAddOns(spokeName).Get(ctx, v1beta1.FCCAddOnName, metav1.GetOptions{}) + if err != nil { + if kerrs.IsNotFound(err) { + return false, nil + } + return false, err + } + mca = obj + return true, nil + }); err != nil { + return fmt.Errorf("failed to locate ManagedClusterAddOn %q in %q: %v", v1beta1.FCCAddOnName, spokeName, err) + } + + desired := addonv1alpha1.AddOnConfig{ ConfigGroupResource: addonv1alpha1.ConfigGroupResource{ Group: addonv1alpha1.GroupName, Resource: AddOnDeploymentConfigResource, }, ConfigReferent: addonv1alpha1.ConfigReferent{fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
152-157: Nice guardrail on supported Hub add-ons.Validating the Hub add-on names against
v1beta1.SupportedHubAddonswill catch typos and unsanctioned entries up front while keeping the constant in one place. Clean, centralized check.
365-377: Good enforcement around the FCC add-on.Differentiating hub-as-spoke clusters from regular spokes and wiring both cases through the shared
v1beta1.FCCAddOnNameconstant keeps the admission rules tight and eliminates drift. Nicely done.fleetconfig-controller/api/v1beta1/spoke_types.go (2)
102-106: LGTM: Hub-as-Spoke detection looks consistent.Name check and in-cluster kubeconfig are reasonable signals for hub-as-spoke. No issues spotted.
107-118: LGTM: PivotComplete gating is sound.Requiring SpokeJoined=true, PivotComplete=true, and presence of SpokeCleanupFinalizer aligns with the deletion flow ordering and avoids premature hub cleanup.
Based on learnings
fleetconfig-controller/cmd/manager/manager.go (2)
265-268: Good fix: derive cert/key base path via filepath.Dir.This addresses prior fragility with TrimSuffix("kubeconfig").
259-283: CA handling is correct; no need to patch certificate-authority.OCM addon hub kubeconfigs use certificate-authority-data, so cert/key path rewrites are sufficient.
If you expect any kubeconfigs with certificate-authority (file path), confirm and we can extend the patcher. Otherwise this is good as-is. Based on learnings
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (3)
496-527: Double-check “self-destruct” namespace deletions.Deleting CONTROLLER_NAMESPACE on the spoke is intentional, but ensure the manager isn’t running against that same cluster at this point to avoid deleting itself mid-reconcile in a way that leaves dependent resources half-cleaned.
Would you like a preflight check that skips deleting the controller namespace if LeaderElection is still active?
671-672: Redact hub token in logs.joinArgs includes --hub-token; logging it leaks credentials.
Apply this diff:
- logger.V(1).Info("clusteradm join", "args", joinArgs) + // Redact sensitive args before logging + maskedArgs := make([]string, len(joinArgs)) + copy(maskedArgs, joinArgs) + for i := 0; i < len(maskedArgs)-1; i++ { + if maskedArgs[i] == "--hub-token" { + maskedArgs[i+1] = "REDACTED" + break + } + } + logger.V(1).Info("clusteradm join", "args", maskedArgs)
897-897: Stop logging the token payload.stdout contains the token JSON; don’t emit it.
Apply this diff:
- logger.V(1).Info("got join token", "output", string(stdout)) + logger.V(1).Info("got join token")
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
Outdated
Show resolved
Hide resolved
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
fleetconfig-controller/api/v1beta1/spoke_types.go(1 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(1 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation.go(7 hunks)fleetconfig-controller/internal/webhook/v1beta1/validation_test.go(8 hunks)
🧰 Additional context used
🧠 Learnings (10)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/webhook/v1beta1/validation_test.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/webhook/v1beta1/validation_test.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-26T04:15:34.443Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.443Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/webhook/v1beta1/validation_test.go
📚 Learning: 2025-09-26T04:13:12.098Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:719-724
Timestamp: 2025-09-26T04:13:12.098Z
Learning: The TestValidateAddonUniqueness function in fleetconfig-controller/internal/webhook/v1beta1/validation_test.go has flawed assertion logic that uses `len(err.Error()) >= len(expectedMsg)` to check error messages, which doesn't actually validate message content and will pass for any error longer than the expected message.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/webhook/v1beta1/validation_test.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/webhook/v1beta1/validation.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (4)
fleetconfig-controller/api/v1beta1/spoke_types.go (3)
fleetconfig-controller/api/v1alpha1/fleetconfig_types.go (2)
Spoke(350-392)Kubeconfig(301-317)fleetconfig-controller/api/v1beta1/constants.go (3)
ManagedClusterTypeHubAsSpoke(71-71)PivotComplete(31-31)SpokeJoined(28-28)fleetconfig-controller/api/v1beta1/common.go (1)
Kubeconfig(13-29)
fleetconfig-controller/internal/webhook/v1beta1/validation.go (2)
fleetconfig-controller/api/v1beta1/spoke_types.go (3)
Klusterlet(117-179)AddOn(251-263)HubRef(86-94)fleetconfig-controller/api/v1beta1/constants.go (3)
SupportedHubAddons(172-175)FCCAddOnName(103-103)HubInitialized(19-19)
fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (1)
fleetconfig-controller/api/v1beta1/hub_types.go (4)
Hub(243-257)HubSpec(26-79)AddOnConfig(158-181)HubAddOn(184-198)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(46-52)fleetconfig-controller/api/v1beta1/spoke_types.go (5)
Spoke(291-305)KlusterletChartConfig(196-198)HubRef(86-94)AddOn(251-263)Klusterlet(117-179)fleetconfig-controller/api/v1beta1/constants.go (19)
ClusterTypeHub(76-76)PivotComplete(31-31)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)SpokeJoined(28-28)LabelManagedClusterType(118-118)ManagedClusterTypeHubAsSpoke(71-71)ControllerNamespaceEnvVar(97-97)HubNamespaceEnvVar(94-94)SpokeNamespaceEnvVar(91-91)AddonsConfigured(22-22)RoleNameEnvVar(100-100)DefaultFCCManagerRole(106-106)FCCAddOnName(103-103)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)HubCleanupFinalizer(10-10)HubInitialized(19-19)AWSIRSARegistrationDriver(130-130)fleetconfig-controller/pkg/common/common.go (6)
ClusterClient(20-30)AddOnClient(59-69)GetManagedCluster(72-81)UpdateManagedCluster(84-89)WorkClient(46-56)OperatorClient(33-43)fleetconfig-controller/internal/kube/kube.go (3)
KubeconfigFromSecretOrCluster(113-119)RawFromInClusterRestConfig(75-81)RestConfigFromKubeconfig(22-32)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (2)
PrepareResources(32-55)PrepareKubeconfig(15-29)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)fleetconfig-controller/internal/version/version.go (2)
LowestBundleVersion(15-46)Normalize(49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (8)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (2)
674-681: Redact the hub token before logging clusteradm args.Line 674 still emits
joinArgswith--hub-token <secret>, leaking credentials to the controller logs—the same issue we previously flagged. Please mask that entry before logging.- logger.V(1).Info("clusteradm join", "args", joinArgs) + maskedArgs := make([]string, len(joinArgs)) + copy(maskedArgs, joinArgs) + for i := 0; i < len(maskedArgs)-1; i++ { + if maskedArgs[i] == "--hub-token" { + maskedArgs[i+1] = "REDACTED" + break + } + } + logger.V(1).Info("clusteradm join", "args", maskedArgs)
900-901: Stop logging the token payload.
clusteradm get tokenwrites the hub join token JSON to stdout; loggingstring(stdout)at Line 900 still exposes that secret verbatim.- logger.V(1).Info("got join token", "output", string(stdout)) + logger.V(1).Info("got join token")fleetconfig-controller/internal/webhook/v1beta1/validation_test.go (3)
690-701: Fix the expected message for unsupported hub add-ons.When the validator sees
"custom-addon"it returnsinvalid hubAddOn name. must be one of ..., not the clash message currently asserted here. The test only “passes” because the helper compares string lengths. Please update the expectation to match the real error.- errMsgs: []string{"hubAddOn name argocd clashes with an existing addOnConfig name"}, + errMsgs: []string{"invalid hubAddOn name. must be one of"},
703-724: Correct the expected clash strings in the combined-conflicts scenario.This case still reports clashes on
argocd, yet the expected messages mentionshared-addon. The mismatch is hidden by the faulty assertion helper—please update the strings.- "hubAddOn name shared-addon clashes with an existing addOnConfig name", - "hubAddOn name shared-addon clashes with an existing addOnConfig name", + "hubAddOn name argocd clashes with an existing addOnConfig name", + "hubAddOn name argocd clashes with an existing addOnConfig name",
753-770: Replace the length check with a real substring match.Lines 757-764 still treat
len(actual) >= len(expected)as a “partial match”, so any longer error satisfies the assertion regardless of content. Switch tostrings.Contains(and importstrings) so the tests actually validate the message.-import ( - "testing" +import ( + "strings" + "testing" @@ - if err.Error() != "" && err.Error() != expectedMsg { - // For partial matching since error messages include indices - if len(expectedMsg) > 0 && err.Error() != "" { - // Check if the error message contains the expected substring - if len(err.Error()) >= len(expectedMsg) { - found = true - break - } - } - } else if err.Error() == expectedMsg { + if err.Error() == expectedMsg || (expectedMsg != "" && err.Error() != "" && strings.Contains(err.Error(), expectedMsg)) { found = true break }fleetconfig-controller/api/v1beta1/spoke_types.go (2)
101-104: Solid hub-as-spoke detection guardChecking both the canonical hub-as-spoke name and the InCluster kubeconfig flag keeps the special-case handling aligned with the new cluster typing story without introducing extra state.
106-114: PivotComplete now mirrors the observed condition flowRequiring
SpokeJoinedbefore honoringPivotCompletematches the reconciler’s readiness semantics and avoids premature “day 2” status flips.fleetconfig-controller/internal/webhook/v1beta1/validation.go (1)
370-404: Add-on validation now respects hub-as-spoke semantics and preserves hard failuresBlocking FCC enablement on hub-as-spoke spokes, requiring it elsewhere, and returning accumulated errors even when hub discovery falls back to a warning closes the loophole where bad specs slipped through on missing hubs. Nicely handled.
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
Outdated
Show resolved
Hide resolved
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
fleetconfig-controller/devspace.yaml (4)
163-171: Narrow imageSelector and make port mapping explicit for hub dev mappingTo avoid matching unintended pods and to be explicit about port-forwarding, include the tag and map local:remote.
Apply this diff:
- imageSelector: ${IMAGE_REPOSITORY} + imageSelector: ${IMAGE_REPOSITORY}:dev @@ - - port: ${HUB_PORT} + - port: ${HUB_PORT}:${HUB_PORT}Optional: add a labelSelector to target only the hub Deployment if hub/spoke share the same image.
189-198: Mirror the imageSelector/port improvements for spoke dev mappingSame rationale as hub: be explicit and narrow the match.
Apply this diff:
- imageSelector: ${IMAGE_REPOSITORY} + imageSelector: ${IMAGE_REPOSITORY}:dev @@ - - port: ${SPOKE_PORT} + - port: ${SPOKE_PORT}:${SPOKE_PORT}Optional: add a labelSelector targeting only the spoke Deployment.
Also applies to: 213-213
12-15: Confirmed port bindings in start scriptsdevspace-start-hub.sh listens on 2344 and devspace-start-spoke.sh on 2345. Consider replacing these literals with $HUB_PORT/$SPOKE_PORT for maintainability.
58-66: Add readiness waits & verify debug pipeline targets
In fleetconfig-controller/devspace.yaml’s
debuganddebug-spokepipelines, prepend:kubectl -n cert-manager wait deploy -l app.kubernetes.io/name=cert-manager --for=condition=Available --timeout=120s
Confirm both pipelines deploy into
fleetconfig-systemand use thefleetconfig-controller-managerdeployment name/labels when invokingcreate_deployments.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml(1 hunks)fleetconfig-controller/devspace.yaml(7 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go(9 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(1 hunks)
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
📚 Learning: 2025-09-25T23:18:41.545Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.545Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-22T18:42:03.404Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.
Applied to files:
fleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-22T19:26:11.020Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/test/data/fleetconfig-v1alpha1.yaml:47-53
Timestamp: 2025-09-22T19:26:11.020Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller tests, the kubeconfigKey is intentionally set to "value" in test fixtures (fleetconfig-v1alpha1.yaml, fleetconfig-values.yaml) because that's how the test harness provisions the kubeconfig secret during test setup. This differs from the chart default of "kubeconfig" but is correct for the test environment.
Applied to files:
fleetconfig-controller/devspace.yaml
📚 Learning: 2025-08-27T21:58:32.141Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#58
File: fleetconfig-controller/charts/fleetconfig-controller/README.md:155-155
Timestamp: 2025-08-27T21:58:32.141Z
Learning: In the open-cluster-management-io/lab repository, the fleetconfig-controller follows a workflow where chart version bumps (in README.md and values.yaml) are included in PRs before the corresponding Docker image exists. The Docker image is built and pushed automatically via GitHub release workflows after the PR is merged and tagged, making the referenced version available.
Applied to files:
fleetconfig-controller/devspace.yaml
📚 Learning: 2025-09-25T23:31:11.561Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml:110-112
Timestamp: 2025-09-25T23:31:11.561Z
Learning: The fleetconfig-controller-manager spoke agent requires create/update/patch/delete permissions on CustomResourceDefinitions because `clusteradm upgrade klusterlet` operations need create/update permissions and cleanup operations require delete permissions for proper lifecycle management.
Applied to files:
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (2)
fleetconfig-controller/api/v1beta1/constants.go (8)
ClusterTypeHub(76-76)HubCleanupFinalizer(10-10)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)PivotComplete(31-31)SpokeNameEnvVar(88-88)fleetconfig-controller/cmd/manager/manager.go (1)
Options(33-45)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(46-52)fleetconfig-controller/api/v1beta1/spoke_types.go (5)
Spoke(291-305)KlusterletChartConfig(196-198)HubRef(86-94)AddOn(251-263)Klusterlet(117-179)fleetconfig-controller/api/v1beta1/constants.go (17)
ClusterTypeHub(76-76)ClusterTypeSpoke(79-79)SupportedClusterTypes(110-113)SpokeJoined(28-28)LabelManagedClusterType(118-118)ManagedClusterTypeHubAsSpoke(71-71)ControllerNamespaceEnvVar(97-97)HubNamespaceEnvVar(94-94)SpokeNamespaceEnvVar(91-91)AddonsConfigured(22-22)RoleNameEnvVar(100-100)DefaultFCCManagerRole(106-106)FCCAddOnName(103-103)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)HubCleanupFinalizer(10-10)HubInitialized(19-19)fleetconfig-controller/pkg/common/common.go (6)
ClusterClient(20-30)AddOnClient(59-69)GetManagedCluster(72-81)UpdateManagedCluster(84-89)WorkClient(46-56)OperatorClient(33-43)fleetconfig-controller/internal/kube/kube.go (3)
KubeconfigFromSecretOrCluster(113-119)RawFromInClusterRestConfig(75-81)RestConfigFromKubeconfig(22-32)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (2)
PrepareResources(32-55)PrepareKubeconfig(15-29)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)fleetconfig-controller/internal/version/version.go (2)
LowestBundleVersion(15-46)Normalize(49-55)
🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
[error] 19-19: syntax error: expected , but found ''
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (10)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml (1)
19-20: Quote Helm-built names to unblock YAML lintUnquoted concatenations like
name: {{ include "chart.fullname" . }}-managerare still tripping YAMLlint (expected <block end>, but found '<scalar>'). Please wrap every similarname,serviceAccountName, andclusterRoleNameoccurrence in quotes (e.g.{{ printf "%s-manager" (include "chart.fullname" .) | quote }}) to render valid YAML. This applies here and at Lines 82-83, 116-120, 128-152, and 163-163.Apply this diff (excerpt):
- name: {{ include "chart.fullname" . }}-manager + name: {{ printf "%s-manager" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-manager + name: {{ printf "%s-manager" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-manager-spoke-rolebinding + name: {{ printf "%s-manager-spoke-rolebinding" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-manager-spoke-role + name: {{ printf "%s-manager-spoke-role" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-leader-election-role + name: {{ printf "%s-leader-election-role" (include "chart.fullname" .) | quote }} @@ - name: {{ include "chart.fullname" . }}-leader-election-rolebinding + name: {{ printf "%s-leader-election-rolebinding" (include "chart.fullname" .) | quote }} @@ - clusterRoleName: {{ include "chart.fullname" . }}-manager-role + clusterRoleName: {{ printf "%s-manager-role" (include "chart.fullname" .) | quote }}fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
250-256: ValidateCLUSTER_NAMEbefore adding the predicateIf this env var is empty we scope the controller to a zero-length name, so it never reconciles anything. Please fail fast instead of assuming other wiring already validated it.
func (r *SpokeReconciler) SetupWithManagerForSpoke(mgr ctrl.Manager) error { - spokeName := os.Getenv(v1beta1.SpokeNameEnvVar) // we know this is set, because the mgr setup would have failed otherwise + spokeName := strings.TrimSpace(os.Getenv(v1beta1.SpokeNameEnvVar)) + if spokeName == "" { + return fmt.Errorf("env %s must be set to the Spoke name", v1beta1.SpokeNameEnvVar) + } return ctrl.NewControllerManagedBy(mgr).Don't forget to add
stringsto the imports.fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (4)
305-313: Guard optionalClusterManagerbefore dereferencing
hub.Spec.ClusterManageris optional in the API; if it’s nil this panics the controller. Please nil-check (and reuse a localsourcevariable) before touching.Source.- if hub != nil && hub.Spec.ClusterManager.Source.BundleVersion != "" { - upgrade, err := r.spokeNeedsUpgrade(ctx, spoke, currKlusterletHash, hub.Spec.ClusterManager.Source, spokeKubeconfig) + if hub != nil && hub.Spec.ClusterManager != nil && hub.Spec.ClusterManager.Source.BundleVersion != "" { + source := hub.Spec.ClusterManager.Source + upgrade, err := r.spokeNeedsUpgrade(ctx, spoke, currKlusterletHash, source, spokeKubeconfig) if err != nil { return fmt.Errorf("failed to check if spoke cluster needs upgrade: %w", err) } if upgrade { - if err := r.upgradeSpoke(ctx, spoke, klusterletValues, hub.Spec.ClusterManager.Source, spokeKubeconfig); err != nil { + if err := r.upgradeSpoke(ctx, spoke, klusterletValues, source, spokeKubeconfig); err != nil {
590-592: Handle missingClusterManagerconfig before building join argsWe still dereference
hub.Spec.ClusterManagerwithout a guard. When that block is omitted (it’s optional) the join path will panic. Please detect the nil case and return a clear error instead of crashing.- "--bundle-version", hub.Spec.ClusterManager.Source.BundleVersion, - "--image-registry", hub.Spec.ClusterManager.Source.Registry, + }, spoke.BaseArgs()...) + + if hub.Spec.ClusterManager == nil { + return fmt.Errorf("hub %s/%s is missing clusterManager configuration", hub.Namespace, hub.Name) + } + source := hub.Spec.ClusterManager.Source + joinArgs = append(joinArgs, + "--bundle-version", source.BundleVersion, + "--image-registry", source.Registry,
684-685: Redact the hub token before logging join args
joinArgsincludes--hub-token <secret>, so this log leaks credentials. Please mask or drop the token argument before logging.- logger.V(1).Info("clusteradm join", "args", joinArgs) + maskedArgs := make([]string, len(joinArgs)) + copy(maskedArgs, joinArgs) + for i := 0; i < len(maskedArgs)-1; i++ { + if maskedArgs[i] == "--hub-token" { + maskedArgs[i+1] = "REDACTED" + break + } + } + logger.V(1).Info("clusteradm join", "args", maskedArgs)
910-911: Stop logging the raw join token payload
string(stdout)contains the token JSON, so this log also exposes credentials. Please drop the payload or replace it with an opaque message.- logger.V(1).Info("got join token", "output", string(stdout)) + logger.V(1).Info("got join token")fleetconfig-controller/devspace.yaml (4)
42-42: Dev pipeline now targets hub dev mapping — LGTMScoped start_dev reduces accidental spoke attachment; good change.
114-116: Chart wiring for fleetConfig.enabled — LGTMProfiles flip this via vars; consistent with prior workflow.
172-188: Sync whitelist for hub — LGTMTight include set minimizes sync churn; inclusion of devspace-start-hub.sh is correct.
137-144: Resolve: debug manifest directories exist
Bothfleetconfig-controller/config/devspace/hubandfleetconfig-controller/config/devspace/spokeare present.
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
fleetconfig-controller/api/v1beta1/spoke_types.go (1)
92-106: Polish the CleanupConfig doc comments.Nitpicks: fix “it’s” → “its” and drop the duplicate “will” so the comments read cleanly.
- // If true, the agent will attempt to garbage collect it's own namespace after the spoke cluster is unjoined. + // If true, the agent will attempt to garbage collect its own namespace after the spoke cluster is unjoined. @@ - // If set, the kubeconfig secret will will be automatically deleted after the agent has taken over managing the Spoke. + // If set, the kubeconfig secret will be automatically deleted after the agent has taken over managing the Spoke.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
fleetconfig-controller/api/v1beta1/constants.go(4 hunks)fleetconfig-controller/api/v1beta1/spoke_types.go(2 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/deployment-dev.yaml(1 hunks)fleetconfig-controller/config/devspace/hub/manager.yaml(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- fleetconfig-controller/charts/fleetconfig-controller/templates/deployment-dev.yaml
- fleetconfig-controller/config/devspace/hub/manager.yaml
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/api/v1beta1/constants.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.go
🧬 Code graph analysis (1)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
ManagedClusterTypeHubAsSpoke(77-77)PivotComplete(31-31)SpokeJoined(28-28)fleetconfig-controller/api/v1beta1/common.go (1)
Kubeconfig(13-29)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
158-194: Pass the hub kubeconfig intoclusteradm accept.
clusteradm acceptis invoked without--kubeconfig, so it falls back to~/.kube/config. Inside the controller pod that file (andKUBECONFIG) are absent, which makes every accept attempt fail. Hand the hub kubeconfig/context toacceptCluster(like we already do forjoin/get token) and plumb it through so the command always targets the hub API.
🧹 Nitpick comments (2)
fleetconfig-controller/internal/watch/watch.go (2)
51-68: Consider adding panic recovery to prevent manager crashes.The Start method implements the manager.Runnable pattern correctly. However, if the check method or the handler function panics, Start will panic and crash the manager. Consider adding a recover() block in the ticker case to log panics and continue the loop, improving resilience.
Apply this diff to add panic recovery:
case <-ticker.C: + func() { + defer func() { + if r := recover(); r != nil { + w.log.Error(fmt.Errorf("panic: %v", r), "Watch check panicked", "name", w.name) + } + }() if err := w.check(ctx); err != nil { w.log.Error(err, "Watch check failed", "name", w.name) } + }() }Note: You'll need to import "fmt" for this change.
70-82: Consider adding timeouts for condition and handler execution.The check method correctly evaluates the condition and invokes the handler. However, if the condition or handler functions are slow or block indefinitely, this could delay subsequent ticks or cause the watcher to become unresponsive. Consider wrapping the condition and handler calls with a context timeout to prevent long-running operations from blocking the watcher.
Example:
func (w *ResourceWatcher) check(ctx context.Context) error { // Add timeout for condition check conditionCtx, cancel := context.WithTimeout(ctx, 30*time.Second) defer cancel() met, err := w.condition(conditionCtx, w.client) if err != nil { return err } if !met { return nil } w.log.V(1).Info("Condition met, executing handler", "name", w.name) // Add timeout for handler execution handlerCtx, cancel := context.WithTimeout(ctx, 30*time.Second) defer cancel() return w.handler(handlerCtx, w.client) }You may want to make the timeout configurable via the Config struct.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
fleetconfig-controller/api/v1beta1/constants.go(4 hunks)fleetconfig-controller/api/v1beta1/spoke_types.go(2 hunks)fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yaml(2 hunks)fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml(1 hunks)fleetconfig-controller/config/devspace/spoke/manager.yaml(1 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go(10 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(1 hunks)fleetconfig-controller/internal/watch/watch.go(1 hunks)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/PROJECT:28-31
Timestamp: 2025-09-22T18:42:03.404Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller, the PROJECT file defines multiple API resources with different webhook configurations: FleetConfig v1alpha1 has defaulting: true (requiring MutatingWebhookConfiguration), while Hub and Spoke v1beta1 resources have defaulting: false. MutatingWebhookConfiguration resources in the manifests serve the v1alpha1 FleetConfig, not the v1beta1 Hub/Spoke resources.
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/charts/fleetconfig-controller/crds/fleetconfig.open-cluster-management.io_spokes.yamlfleetconfig-controller/api/v1beta1/spoke_types.gofleetconfig-controller/api/v1beta1/constants.gofleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-08-22T19:38:49.769Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#52
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller_test.go:49-56
Timestamp: 2025-08-22T19:38:49.769Z
Learning: In the fleetconfig-controller project, the SpokeSpec and HubSpec structs in v1beta1 contain only optional fields (like Foo *string with omitempty tags), so creating these resources without populating the Spec field does not cause validation failures in tests.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.go
📚 Learning: 2025-09-22T19:16:34.109Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/webhook/v1beta1/validation.go:103-121
Timestamp: 2025-09-22T19:16:34.109Z
Learning: In the open-cluster-management-io/lab repository's fleetconfig-controller v1beta1 API, the Klusterlet field in SpokeSpec is defined as a struct value (Klusterlet Klusterlet), not a pointer (*Klusterlet), so direct field access like Klusterlet.Annotations is safe without nil checks. The Klusterlet struct does not contain a Source field.
Applied to files:
fleetconfig-controller/api/v1beta1/spoke_types.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.gofleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-25T23:31:11.630Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml:110-112
Timestamp: 2025-09-25T23:31:11.630Z
Learning: The fleetconfig-controller-manager spoke agent requires create/update/patch/delete permissions on CustomResourceDefinitions because `clusteradm upgrade klusterlet` operations need create/update permissions and cleanup operations require delete permissions for proper lifecycle management.
Applied to files:
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
🧬 Code graph analysis (2)
fleetconfig-controller/api/v1beta1/spoke_types.go (2)
fleetconfig-controller/api/v1beta1/constants.go (3)
ManagedClusterTypeHubAsSpoke(77-77)PivotComplete(31-31)SpokeJoined(28-28)fleetconfig-controller/api/v1beta1/common.go (1)
Kubeconfig(13-29)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(51-57)fleetconfig-controller/api/v1beta1/spoke_types.go (6)
Spoke(309-323)KlusterletChartConfig(214-216)HubRef(110-118)CleanupConfig(91-107)AddOn(269-281)Klusterlet(141-197)fleetconfig-controller/api/v1beta1/constants.go (23)
InstanceTypeManager(83-83)PivotComplete(31-31)InstanceTypeUnified(89-89)InstanceTypeAgent(86-86)SupportedInstanceTypes(135-139)KlusterletSynced(34-34)SpokeJoined(28-28)LabelManagedClusterType(151-151)ManagedClusterTypeHubAsSpoke(77-77)FCCAddOnName(116-116)ControllerNamespaceEnvVar(107-107)HubNamespaceEnvVar(104-104)SpokeNamespaceEnvVar(101-101)PurgeAgentNamespaceEnvVar(113-113)AddonsConfigured(22-22)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)HubCleanupFinalizer(10-10)OCMSpokeNamespaces(142-146)HubInitialized(19-19)AWSIRSARegistrationDriver(163-163)BundleVersionDefault(215-215)BundleVersionLatest(212-212)fleetconfig-controller/pkg/common/common.go (6)
ClusterClient(20-30)AddOnClient(59-69)GetManagedCluster(72-81)UpdateManagedCluster(84-89)WorkClient(46-56)OperatorClient(33-43)fleetconfig-controller/internal/kube/kube.go (3)
KubeconfigFromSecretOrCluster(113-119)RestConfigFromKubeconfig(22-32)RawFromInClusterRestConfig(75-81)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (2)
PrepareResources(32-55)PrepareKubeconfig(15-29)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)fleetconfig-controller/internal/version/version.go (2)
LowestBundleVersion(15-46)Normalize(49-55)
🪛 Checkov (3.2.334)
fleetconfig-controller/config/devspace/spoke/manager.yaml
[medium] 1-58: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 1-58: Minimize the admission of root containers
(CKV_K8S_23)
🪛 YAMLlint (1.37.1)
fleetconfig-controller/charts/fleetconfig-controller/templates/ocm/fcc-addon/addon-template.yaml
[error] 2-2: syntax error: expected the node content, but found '-'
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
🔇 Additional comments (3)
fleetconfig-controller/internal/watch/watch.go (3)
1-10: LGTM!Package declaration and imports are appropriate for a periodic resource watcher implementing manager.Runnable.
12-17: LGTM!Function type signatures follow idiomatic controller-runtime patterns with proper context propagation.
19-27: LGTM!Struct encapsulation is correct with all fields unexported, preventing external mutation.
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
268-299: CLUSTER_NAME validation is missing in manager setup.Line 269 assumes
CLUSTER_NAMEis validated during manager setup, but as previously identified,manager.goonly validatesCLUSTER_NAMESPACE(SpokeNamespaceEnvVar) and does not checkCLUSTER_NAME(SpokeNameEnvVar). Without this validation, the controller can start with an emptyspokeName, causing the predicate filter to silently match nothing and leaving the controller inert.Ensure
manager.govalidates thatos.Getenv(v1beta1.SpokeNameEnvVar)is non-empty before callingSetupWithManagerForSpoke.
🧹 Nitpick comments (2)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
228-265: Consider whether Hub status changes should trigger Spoke reconciliation.The Hub watch predicate now triggers reconciliation when
Hub.Statuschanges (line 256). Hub status updates typically don't affect Spoke configuration, and watching status can cause unnecessary reconciles if the status changes frequently.Consider removing the status comparison if Hub status changes don't require Spoke updates:
- return sharedFieldsChanged(oldHub.Spec.DeepCopy(), newHub.Spec.DeepCopy()) || - !reflect.DeepEqual(oldHub.Status, newHub.Status) + return sharedFieldsChanged(oldHub.Spec.DeepCopy(), newHub.Spec.DeepCopy())If Hub status changes do affect Spoke behavior, please add a comment explaining the dependency.
fleetconfig-controller/internal/watch/watch.go (1)
87-109: Consider running the first check immediately on startRight now the watcher waits a full interval before the first evaluation. That can delay handlers noticeably when the condition is already true at startup (common for readiness/cleanup flows). You could reuse the same panic guard by hoisting it into a small helper and invoking it once before entering the loop, e.g.:
@@ - for { - select { - case <-ctx.Done(): + run := func() { + defer func() { + if r := recover(); r != nil { + w.log.Error(fmt.Errorf("panic: %v", r), "Watch check panicked", "name", w.name) + } + }() + if err := w.check(ctx); err != nil { + w.log.Error(err, "Watch check failed", "name", w.name) + } + } + + run() + + for { + select { + case <-ctx.Done(): w.log.Info("Shutting down resource watcher", "name", w.name) return nil - case <-ticker.C: - func() { - defer func() { - if r := recover(); r != nil { - w.log.Error(fmt.Errorf("panic: %v", r), "Watch check panicked", "name", w.name) - } - }() - if err := w.check(ctx); err != nil { - w.log.Error(err, "Watch check failed", "name", w.name) - } - }() + case <-ticker.C: + run() } }Keeps the existing behaviour on ticks while removing the initial delay.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go(10 hunks)fleetconfig-controller/internal/watch/watch.go(1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-09-25T23:18:41.573Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:125-133
Timestamp: 2025-09-25T23:18:41.573Z
Learning: In the fleetconfig-controller spoke deletion flow, SpokeCleanupFinalizer is always removed before HubCleanupFinalizer. This means that checking for the existence of HubCleanupFinalizer in the deletion logic is sufficient regardless of cluster type, as any SpokeCleanupFinalizer would have already been removed by the time the hub cleanup runs.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go
🧬 Code graph analysis (1)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (6)
fleetconfig-controller/api/v1beta1/constants.go (14)
InstanceTypeManager(83-83)HubCleanupFinalizer(10-10)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)InstanceTypeUnified(89-89)InstanceTypeAgent(86-86)SupportedInstanceTypes(135-139)PivotComplete(31-31)KlusterletSynced(34-34)SpokeNameEnvVar(98-98)AgentCleanupWatcherName(131-131)SpokeNamespaceEnvVar(101-101)PurgeAgentNamespaceEnvVar(113-113)ControllerNamespaceEnvVar(107-107)fleetconfig-controller/api/v1beta1/common.go (2)
NewCondition(112-123)Condition(126-129)fleetconfig-controller/internal/watch/watch.go (3)
NewOrDie(78-84)Config(35-43)New(46-76)fleetconfig-controller/cmd/manager/manager.go (1)
Options(33-45)fleetconfig-controller/internal/kube/kube.go (2)
RawFromInClusterRestConfig(75-81)RestConfigFromKubeconfig(22-32)fleetconfig-controller/pkg/common/common.go (1)
WorkClient(46-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (6)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (5)
57-57: LGTM! InstanceType field appropriately added.The
InstanceTypefield enables instance-aware reconciliation logic, aligning with the new hub/spoke architecture. The field is properly utilized throughout the reconcile flow.
141-149: LGTM! Cleanup logic correctly gates on HubCleanupFinalizer.The deletion flow properly checks for
HubCleanupFinalizerbefore executing cleanup. Based on learnings,SpokeCleanupFinalizeris always removed beforeHubCleanupFinalizer, so this check is sufficient regardless of cluster type.Based on learnings.
170-172: Verify the semantic intent of KlusterletSynced wantStatus.The
KlusterletSyncedcondition is initialized withWantStatus: metav1.ConditionFalse, which is unusual. Typically, conditions should eventually reach aTruestate to indicate success. Please confirm:
- Is
KlusterletSyncedintended to represent an error/failure state (whereTruemeans out-of-sync)?- Or should the
WantStatusbemetav1.ConditionTrueto indicate successful synchronization?If the current behavior is correct, consider adding a comment explaining why
WantStatus=Falseis the desired steady state.
301-312: LGTM! Spoke deletion condition logic is correct.The
spokeDeletedConditionfunction correctly returnstruewhen the Spoke resource is not found, indicating successful deletion. Error handling appropriately ignoresNotFounderrors.
314-348: LGTM! Agent self-destruct logic uses safe defaults.The
agentSelfDestructfunction implements appropriate cleanup with defensive error handling:
- Falls back to orphaning the namespace (less destructive) if
PURGE_AGENT_NAMESPACEparsing fails- Only deletes the agent namespace when explicitly enabled
- Properly cleans up
AppliedManifestWorksConsider documenting the
PURGE_AGENT_NAMESPACEenvironment variable and its default behavior in user-facing documentation to clarify the namespace cleanup behavior.fleetconfig-controller/internal/watch/watch.go (1)
45-75: Validation guardrail looks solidGood call returning an error when required Config fields are missing; this closes the earlier gap and makes New safer to consume.
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
fleetconfig-controller/internal/watch/watch.go (1)
45-76: Consider validating or documenting the Name field.The constructor validates all critical fields, but
cfg.Nameis not validated. While an empty name won't cause runtime errors, it could make logs less useful for distinguishing multiple watchers. Consider either validating that Name is non-empty or documenting that it's optional.Apply this diff to add Name validation if it should be required:
if cfg.Interval <= 0 { return nil, errors.New("watch.Config.Interval must be positive") } + if cfg.Name == "" { + return nil, errors.New("watch.Config.Name must not be empty") + } timeout := cfg.TimeoutAlternatively, document in the Config struct comment that Name is optional.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fleetconfig-controller/internal/watch/watch.go(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (6)
fleetconfig-controller/internal/watch/watch.go (6)
1-12: LGTM!Package declaration and imports are appropriate for a generic watcher implementation.
14-14: LGTM!The default timeout of 10 seconds is reasonable for watch operations.
16-43: LGTM!Type definitions are well-structured with appropriate encapsulation. Function types provide clear contracts for condition checking and handler execution.
78-85: LGTM!The
NewOrDieconstructor correctly implements the panic-on-error pattern with helpful error context.
87-111: LGTM!The
Startmethod correctly implements themanager.Runnableinterface with proper resource cleanup, context cancellation handling, and panic recovery. Logging errors and panics without stopping the watcher is appropriate for a resilient long-running component.
113-131: LGTM!The
checkmethod properly manages timeout contexts for both condition evaluation and handler execution. Separate timeouts for each operation provide good isolation and resource control.
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
fleetconfig-controller/internal/controller/v1beta1/addon.go (2)
410-413: ****This duplicates a prior review comment. The MCA may not exist immediately after enable due to eventual consistency. Add retry logic with wait.PollUntilContextTimeout to handle NotFound errors gracefully.
Based on past review comments.
424-428: ****This duplicates a prior review comment. The idempotency check accesses fields incorrectly.
AddOnConfigembedsConfigGroupResourceandConfigReferent, so you must accessc.ConfigGroupResource.Group,c.ConfigGroupResource.Resource,c.ConfigReferent.Name, andc.ConfigReferent.Namespaceinstead ofc.Group,c.Resource,c.Name,c.Namespace.Based on past review comments.
🧹 Nitpick comments (2)
fleetconfig-controller/internal/controller/v1beta1/addon.go (2)
702-713: Strengthen validation when!shouldCleanAll.When
expectedWorks == 1and!shouldCleanAll, the logic checks if the remaining ManifestWork is the FCC addon. However, if the label is missing (ok == false), the function continues polling indefinitely rather than failing fast. Consider returning an error if the remaining work is not the expected FCC addon.Apply this diff to fail fast on unexpected remaining works:
if len(manifestWorks.Items) == expectedWorks { if shouldCleanAll { logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWorks", len(manifestWorks.Items)) return true, nil } mw := manifestWorks.Items[0] val, ok := mw.Labels[addonv1alpha1.AddonLabelKey] - if ok && val == v1beta1.FCCAddOnName { + if !ok || val != v1beta1.FCCAddOnName { + return false, fmt.Errorf("unexpected remaining ManifestWork: expected %s, got label=%q (ok=%t)", v1beta1.FCCAddOnName, val, ok) + } - logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWork", mw.Name) - return true, nil - } + logger.V(1).Info("addon manifestWorks cleanup completed", "spokeName", spokeName, "remainingManifestWork", mw.Name) + return true, nil }
792-810: Consider using strategic merge patch for idempotency.The current logic overwrites the entire RoleBinding on update, which could cause issues if other controllers modify the binding. Using a strategic merge patch would be more robust and idempotent.
Apply this diff to use patch instead of update:
err := r.Create(ctx, binding, &client.CreateOptions{}) if err != nil { if !kerrs.IsAlreadyExists(err) { logger.Error(err, "failed to create role binding for addon") return err } - curr := &rbacv1.RoleBinding{} - err = r.Get(ctx, types.NamespacedName{Namespace: binding.Namespace, Name: binding.Name}, curr) - if err != nil { - logger.Error(err, "failed to get role binding for addon") - return err - } - binding.SetResourceVersion(curr.ResourceVersion) - err = r.Update(ctx, binding) - if err != nil { - logger.Error(err, "failed to update role binding for addon") - return err - } + // Use strategic merge patch to update only the fields we care about + patch := client.MergeFrom(&rbacv1.RoleBinding{ + ObjectMeta: metav1.ObjectMeta{ + Name: binding.Name, + Namespace: binding.Namespace, + }, + }) + if err = r.Patch(ctx, binding, patch); err != nil { + logger.Error(err, "failed to patch role binding for addon") + return err + } } return nil
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
fleetconfig-controller/internal/args/args.go(2 hunks)fleetconfig-controller/internal/args/args_test.go(1 hunks)fleetconfig-controller/internal/controller/v1alpha1/addon.go(6 hunks)fleetconfig-controller/internal/controller/v1alpha1/hub.go(4 hunks)fleetconfig-controller/internal/controller/v1alpha1/spoke.go(7 hunks)fleetconfig-controller/internal/controller/v1beta1/addon.go(15 hunks)fleetconfig-controller/internal/controller/v1beta1/hub_controller.go(7 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(1 hunks)
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-09-26T04:15:34.475Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#69
File: fleetconfig-controller/internal/webhook/v1beta1/validation_test.go:699-701
Timestamp: 2025-09-26T04:15:34.475Z
Learning: In fleetconfig-controller/internal/webhook/v1beta1/validation_test.go, the TestValidateAddonUniqueness test has flawed assertion logic at lines 763-769 that uses `if len(err.Error()) >= len(expectedMsg)` for partial matching, which causes false positives when actual error messages are longer than expected messages, allowing tests to pass with completely incorrect expected error messages.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: The waitForAddonManifestWorksCleanup function is called specifically during spoke cleanup/unjoin operations, only when addons were previously enabled, and after addon disable operations. In this controlled context, checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks since other ManifestWorks would have been handled earlier in the cleanup flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T20:53:55.600Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/addon.go:641-661
Timestamp: 2025-09-12T20:53:55.600Z
Learning: In the fleetconfig-controller addon cleanup flow, waitForAddonManifestWorksCleanup is only called in contexts where non-addon ManifestWorks have already been handled, so checking for zero total ManifestWorks is equivalent to checking for zero addon ManifestWorks.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/addon.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (7)
fleetconfig-controller/internal/controller/v1alpha1/hub.go (1)
fleetconfig-controller/internal/args/args.go (3)
PrepareResources(44-67)PrepareKubeconfig(27-41)SanitizeArgs(82-102)
fleetconfig-controller/internal/controller/v1beta1/hub_controller.go (3)
fleetconfig-controller/api/v1beta1/common.go (2)
NewCondition(112-123)Kubeconfig(13-29)fleetconfig-controller/api/v1beta1/constants.go (3)
HubUpgradeFailed(37-37)BundleVersionDefault(215-215)BundleVersionLatest(212-212)fleetconfig-controller/internal/args/args.go (3)
SanitizeArgs(82-102)PrepareResources(44-67)PrepareKubeconfig(27-41)
fleetconfig-controller/internal/args/args_test.go (1)
fleetconfig-controller/internal/args/args.go (2)
Redacted(15-15)SanitizeArgs(82-102)
fleetconfig-controller/internal/controller/v1beta1/addon.go (6)
fleetconfig-controller/api/v1beta1/constants.go (7)
SupportedHubAddons(205-208)LabelAddOnManagedBy(154-154)FCCAddOnName(116-116)BundleVersionLatest(212-212)AddonArgoCD(198-198)ClusterRoleNameEnvVar(110-110)DefaultFCCManagerRole(119-119)fleetconfig-controller/internal/args/args.go (1)
SanitizeArgs(82-102)fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/register.go (2)
GroupName(10-10)Resource(26-28)fleetconfig-controller/internal/controller/v1beta1/constants.go (1)
AddOnDeploymentConfigResource(30-30)fleetconfig-controller/vendor/open-cluster-management.io/api/addon/v1alpha1/types_managedclusteraddon.go (1)
AddonLabelKey(240-240)fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(52-58)
fleetconfig-controller/internal/controller/v1alpha1/addon.go (1)
fleetconfig-controller/internal/args/args.go (1)
SanitizeArgs(82-102)
fleetconfig-controller/internal/controller/v1alpha1/spoke.go (2)
fleetconfig-controller/internal/args/args.go (4)
SanitizeArgs(82-102)PrepareKubeconfig(27-41)Redacted(15-15)PrepareResources(44-67)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(52-58)fleetconfig-controller/api/v1beta1/spoke_types.go (6)
Spoke(309-323)KlusterletChartConfig(214-216)HubRef(110-118)CleanupConfig(91-107)AddOn(269-281)Klusterlet(141-197)fleetconfig-controller/api/v1beta1/constants.go (23)
InstanceTypeManager(83-83)PivotComplete(31-31)InstanceTypeUnified(89-89)InstanceTypeAgent(86-86)SupportedInstanceTypes(135-139)KlusterletSynced(34-34)SpokeJoined(28-28)LabelManagedClusterType(151-151)ManagedClusterTypeHubAsSpoke(77-77)FCCAddOnName(116-116)ControllerNamespaceEnvVar(107-107)HubNamespaceEnvVar(104-104)SpokeNamespaceEnvVar(101-101)PurgeAgentNamespaceEnvVar(113-113)AddonsConfigured(22-22)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)HubCleanupFinalizer(10-10)OCMSpokeNamespaces(142-146)HubInitialized(19-19)AWSIRSARegistrationDriver(163-163)BundleVersionDefault(215-215)BundleVersionLatest(212-212)fleetconfig-controller/pkg/common/common.go (6)
ClusterClient(20-30)AddOnClient(59-69)GetManagedCluster(72-81)UpdateManagedCluster(84-89)WorkClient(46-56)OperatorClient(33-43)fleetconfig-controller/internal/kube/kube.go (3)
KubeconfigFromSecretOrCluster(113-119)RestConfigFromKubeconfig(22-32)RawFromInClusterRestConfig(75-81)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (4)
PrepareResources(44-67)PrepareKubeconfig(27-41)SanitizeArgs(82-102)Redacted(15-15)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)fleetconfig-controller/internal/version/version.go (2)
LowestBundleVersion(15-46)Normalize(49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (17)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (17)
38-66: LGTM!The cleanup function correctly handles different instance types and the previous context extraction issue has been resolved.
68-120: LGTM!The handleSpoke function correctly orchestrates hub and spoke work based on instance type with appropriate error handling.
280-302: LGTM!The createAgentNamespace function correctly handles namespace creation with proper error checking for AlreadyExists.
304-325: LGTM!The deleteKubeconfigSecret function correctly handles secret deletion with appropriate guards and error handling.
327-364: LGTM!The doSpokeWork function correctly handles spoke-side upgrade logic with proper nil guards for ClusterManager.
366-501: LGTM!The doHubCleanup function implements a comprehensive cleanup sequence with proper error handling and finalizer management.
503-576: LGTM!The doSpokeCleanup function correctly handles spoke-side cleanup with appropriate logic for different cleanup scenarios.
762-775: LGTM!The getJoinedCondition function correctly retrieves the joined condition with appropriate nil checks.
777-838: LGTM!The spokeNeedsUpgrade function implements comprehensive upgrade detection logic with proper handling of special bundleVersion values.
840-883: LGTM!The upgradeSpoke function correctly handles klusterlet upgrades with proper argument sanitization and logging.
885-915: LGTM!The unjoinSpoke function correctly handles spoke deregistration with proper argument sanitization.
917-952: LGTM!The getToken function correctly retrieves the join token and properly redacts sensitive information in logs using
arg_utils.Redacted.
954-975: LGTM!The getHubMeta function correctly retrieves Hub metadata with appropriate kubeconfig loading based on instance type.
977-1045: LGTM!The mergeKlusterletValues function implements comprehensive value merging logic with proper error handling and clear log messages for missing ConfigMaps or keys.
1047-1066: LGTM!The prepareKlusterletValuesFile function correctly handles nil and empty values with appropriate cleanup function management.
158-164: We need to inspectSpoke.BaseArgsto confirm whether it passes the hub kubeconfig toclusteradm. I'll fetch its implementation.
588-730: SanitizeArgs correctly redacts the hub token
ThesensitiveKeyslist includes “hub-token”, soSanitizeArgs(joinArgs)will redact it. Logging the join command’s stdout is safe as it does not include sensitive data.
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
fleetconfig-controller/internal/args/args.go (1)
103-128: Critical: Inline--flag=valuesecrets not sanitized.The current implementation only redacts values when the flag and value are in separate slice elements (e.g.,
["--hub-token", "secret"]). Inline forms like--hub-token=secretare NOT sanitized and will be logged in full, potentially exposing tokens and other secrets. This is a critical security gap that undermines the goal of this refactor.A previous review comment on lines 81-127 correctly identified this issue and provided a fix. Please apply the suggested patch to detect and redact inline
--key=valueforms by:
- Checking if an argument contains
=- Splitting on the first
=to extract the key- Normalizing the key and checking if it's sensitive
- If sensitive, replacing the value after
=withRedactedApply this diff to address the issue:
func sanitizeSlice(words []string) []string { if len(words) == 0 { return []string{} } // Start with a copy of all args out := make([]string, len(words)) copy(out, words) // Iterate through and redact values following sensitive keys for i := 0; i < len(words); i++ { + // Handle inline --key=value forms + if key, _, hasEq := strings.Cut(words[i], "="); hasEq && isSensitiveKey(key) { + out[i] = key + "=" + Redacted + continue + } + // Handle space-separated --key value forms if isSensitiveKey(words[i]) { // Check if there's a next element to redact if i+1 < len(words) { out[i+1] = Redacted i++ // Skip the next element since we just redacted it } } } return out }Additionally, add a test case covering inline sensitive arguments to prevent regression:
func TestSanitizeArgs_InlineForm(t *testing.T) { args := []string{"--hub-token=secret123", "--other-flag", "value"} sanitized := SanitizeArgs(args) expected := []string{"--hub-token=REDACTED", "--other-flag", "value"} if !reflect.DeepEqual(sanitized, expected) { t.Errorf("Expected %v, got %v", expected, sanitized) } }
🧹 Nitpick comments (1)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (1)
885-915: Consider removing commented code.Line 909 contains a commented condition check for
amwExistsError. If this check is no longer needed (because the cleanup logic indoHubCleanupnow properly handles addon manifestwork cleanup before unjoin), remove the commented code to improve readability. If it might be needed in the future, add a TODO comment explaining why.Apply this diff if the check is no longer needed:
stdout, stderr, err := exec_utils.CmdWithLogs(ctx, cmd, fmt.Sprintf("waiting for 'clusteradm unjoin' to complete for spoke %s...", spoke.GetName())) out := append(stdout, stderr...) - if err != nil { //|| strings.Contains(string(out), amwExistsError) { + if err != nil { return fmt.Errorf("failed to unjoin spoke cluster %s: %v, output: %s", spoke.GetName(), err, string(out)) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
fleetconfig-controller/internal/args/args.go(2 hunks)fleetconfig-controller/internal/controller/v1alpha1/spoke.go(7 hunks)fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go(1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early, making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream prepareKlusterletValuesFile function properly handles nil values by returning early with (nil, nil, nil), making additional nil checks unnecessary and potentially harmful to the intended flow.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions (prepareKlusterletValuesFile and hash.ComputeHash) properly handle nil values - prepareKlusterletValuesFile returns early with (nil, nil, nil) and hash.ComputeHash uses ZeroNil: true option to handle nil as zero values.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
📚 Learning: 2025-09-12T22:46:57.106Z
Learnt from: arturshadnik
PR: open-cluster-management-io/lab#59
File: fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go:328-331
Timestamp: 2025-09-12T22:46:57.106Z
Learning: In fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go, nil klusterletValues returned by mergeKlusterletValues is a valid state, not an error condition. The downstream functions handle nil properly: prepareKlusterletValuesFile returns early with (nil, nil, nil) for nil values, and hash.ComputeHash uses ZeroNil: true option to treat nil as zero values when computing hashes.
Applied to files:
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go
🧬 Code graph analysis (2)
fleetconfig-controller/internal/controller/v1alpha1/spoke.go (1)
fleetconfig-controller/internal/args/args.go (4)
SanitizeArgs(82-84)PrepareKubeconfig(27-41)Redacted(16-16)PrepareResources(44-67)
fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (9)
fleetconfig-controller/internal/controller/v1beta1/spoke_controller.go (1)
SpokeReconciler(52-58)fleetconfig-controller/api/v1beta1/spoke_types.go (6)
Spoke(309-323)KlusterletChartConfig(214-216)HubRef(110-118)CleanupConfig(91-107)AddOn(269-281)Klusterlet(141-197)fleetconfig-controller/api/v1beta1/constants.go (18)
InstanceTypeManager(83-83)PivotComplete(31-31)InstanceTypeUnified(89-89)InstanceTypeAgent(86-86)KlusterletSynced(34-34)SpokeJoined(28-28)LabelManagedClusterType(151-151)ManagedClusterTypeHubAsSpoke(77-77)FCCAddOnName(116-116)ControllerNamespaceEnvVar(107-107)HubCleanupPreflightFinalizer(7-7)SpokeCleanupFinalizer(13-13)HubCleanupFinalizer(10-10)OCMSpokeNamespaces(142-146)HubInitialized(19-19)AWSIRSARegistrationDriver(163-163)BundleVersionDefault(215-215)BundleVersionLatest(212-212)fleetconfig-controller/pkg/common/common.go (6)
ClusterClient(20-30)AddOnClient(59-69)GetManagedCluster(72-81)UpdateManagedCluster(84-89)WorkClient(46-56)OperatorClient(33-43)fleetconfig-controller/internal/kube/kube.go (3)
KubeconfigFromSecretOrCluster(113-119)RestConfigFromKubeconfig(22-32)RawFromInClusterRestConfig(75-81)fleetconfig-controller/internal/hash/hash.go (1)
ComputeHash(11-22)fleetconfig-controller/internal/args/args.go (3)
PrepareResources(44-67)PrepareKubeconfig(27-41)SanitizeArgs(82-84)fleetconfig-controller/internal/exec/exec.go (1)
CmdWithLogs(18-54)fleetconfig-controller/internal/version/version.go (2)
LowestBundleVersion(15-46)Normalize(49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: e2e (fleetconfig-controller) / e2e
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
- GitHub Check: test (fleetconfig-controller) / Run Helm Chart Tests
- GitHub Check: test (fleetconfig-controller) / Ensure PR is reviewable & run tests
🔇 Additional comments (30)
fleetconfig-controller/internal/controller/v1alpha1/spoke.go (8)
28-28: LGTM!The aliased import
arg_utilsis clear and consistently used throughout the file for sanitization and argument preparation utilities.
243-243: LGTM!Sanitizing arguments before logging prevents accidental exposure of sensitive data in logs.
281-289: LGTM!Using
PrepareKubeconfigandSanitizeArgsensures that sensitive kubeconfig data and token arguments are handled securely and not leaked in logs.
297-297: LGTM!Using
arg_utils.Redactedto mask the token output is a critical security improvement that prevents accidental token exposure in logs.
333-333: LGTM!Centralizing resource argument preparation via
PrepareResourcesimproves maintainability and consistency across command invocations.
413-421: LGTM!Using
PrepareKubeconfigandSanitizeArgsensures that kubeconfig data and join arguments (including tokens) are handled securely and not exposed in logs.
530-538: LGTM!Using
PrepareKubeconfigandSanitizeArgsensures that kubeconfig data and upgrade arguments are handled securely and not exposed in logs.
590-598: LGTM!Using
PrepareKubeconfigandSanitizeArgsensures that kubeconfig data and unjoin arguments are handled securely and not exposed in logs.fleetconfig-controller/internal/args/args.go (4)
7-8: LGTM!The addition of
slicesandstringsimports is appropriate for the sanitization logic implemented in this file.
15-23: LGTM!The
Redactedconstant andsensitiveKeyslist provide a clear and maintainable foundation for sanitization logic.
81-84: LGTM!The
SanitizeArgsfunction provides a clean public API for sanitizing argument slices.
86-101: LGTM!The
SanitizeOutputfunction provides a clean public API for sanitizing command output by tokenizing, sanitizing, and rejoining text.fleetconfig-controller/internal/controller/v1beta1/spoke_handler.go (18)
38-66: LGTM!The cleanup routing logic correctly handles all three instance types (Manager, Unified, Agent) with appropriate parameters. The panic for unknown instance types is acceptable since this is guarded at manager initialization time.
68-120: LGTM!The spoke handling orchestration correctly routes work based on instance type, properly manages klusterlet values merging, and sets appropriate conditions on errors. The hub-as-spoke special case is handled correctly in Manager mode.
227-231: Verify error handling intent for kubeconfig secret deletion.The code returns an error from
deleteKubeconfigSecret, which will cause the reconcile to fail and requeue. If this deletion is critical for the join flow to proceed (e.g., for security compliance in AddOn mode), this behavior is correct. However, if the secret deletion can fail without blocking the join, consider logging the error without returning it.Based on the PR objectives stating that AddOn mode removes the kubeconfig secret for improved security, this appears to be a critical operation that should block reconciliation if it fails.
Can you confirm whether kubeconfig secret deletion failure should block reconciliation, or if it can be made non-blocking with a warning log?
280-302: LGTM!The agent namespace creation correctly handles the case where the namespace already exists, and properly uses the environment variable for the namespace name.
304-325: LGTM!The kubeconfig secret deletion logic correctly implements the security improvement described in the PR objectives, with appropriate guards to ensure it only deletes when in Manager mode with a successful pivot and when the secret should be purged.
327-364: LGTM!The spoke work handling correctly implements upgrade checking with proper nil guards for ClusterManager, appropriate hash-based change detection, and proper condition management. The nil ClusterManager check at line 345 correctly addresses the past review concern.
Based on learnings
366-501: LGTM!The hub cleanup logic correctly orchestrates the deregistration sequence with proper checks for manifestWork ownership, addon cleanup, finalizer coordination, and resource removal. The differentiation between hub-as-spoke/non-pivoted cleanup and normal AddOn cleanup is handled appropriately.
503-576: LGTM!The spoke cleanup logic correctly coordinates with hub cleanup via finalizers, handles the different kubeconfig sources based on pivot status, and performs appropriate additional cleanup in Agent mode when the operator should be purged.
578-586: LGTM!The helper types are well-defined and appropriately structured for holding token metadata and hub configuration.
732-760: Clarify whether acceptCluster should use explicit hub kubeconfig.A past review comment suggested passing hub kubeconfig explicitly to
acceptClusterto avoid relying on ambient context. However, this function is only called fromdoHubWorkin Manager/Unified mode where the controller should have proper hub access via its default client configuration. Thespoke.BaseArgs()method includes context information if needed.If
clusteradm acceptuses the ambient kubeconfig (via KUBECONFIG env or ~/.kube/config) and the controller is running in-cluster on the hub, this should work correctly. However, if there are scenarios where the ambient context might be incorrect, explicit kubeconfig passing would be safer.Can you clarify whether
clusteradm acceptneeds explicit hub kubeconfig configuration, or if the ambient context is guaranteed to be correct in all deployment scenarios?
762-775: LGTM!The condition retrieval logic is straightforward with appropriate nil guards.
777-838: LGTM!The upgrade detection logic correctly handles hash-based change detection, special bundle version keywords, and actual version comparison using normalized semver. The approach of using the lowest bundle version from the klusterlet spec is appropriate for detecting if all components are at the desired version.
840-883: LGTM!The upgrade logic correctly constructs the upgrade command with appropriate flags, handles temporary files properly with deferred cleanup, and sanitizes logs to avoid exposing sensitive information.
917-952: LGTM!The token retrieval logic correctly addresses the past security concern by not logging the sensitive token payload. The use of
arg_utils.Redactedat line 945 ensures that sensitive data is not exposed in logs. The nil check for ClusterManager is appropriate.
954-975: LGTM!The hub metadata retrieval correctly handles instance-type specific behavior, only loading the hub kubeconfig when needed (not in Agent mode). The error handling appropriately ignores NotFound errors.
977-1045: LGTM!The klusterlet values merging logic correctly handles all cases (nil values, missing ConfigMap/key, spec-only values, ConfigMap-only values, and merged values) with appropriate warnings and error handling. The corrected log message at line 1003 accurately describes the missing key scenario. The merge strategy appropriately prioritizes spec values over ConfigMap values.
Based on learnings
1047-1066: LGTM!The temporary values file preparation correctly handles nil and empty values cases, properly manages temporary file creation and cleanup, and provides appropriate error messages.
Based on learnings
588-730: SanitizeArgs correctly redacts the --hub-token valueConfirmed that
sensitiveKeysincludes “hub-token”, sosanitizeSlicereplaces the token argument withREDACTED.
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
Signed-off-by: Artur Shad Nik <arturshadnik@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice - much better :)
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: arturshadnik, TylerGillson The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
ed8ffbb
into
open-cluster-management-io:main
Description
This PR introduces a new mode of operation "FleetConfig controller as an AddOn". In this mode, after a Spoke has been successfully joined to the Hub, a
fleetconfig-controller-agentwill be installed in the spoke cluster as an OCM addon. Once installed, the agent will take over managing the cluster for all upgrade and cleanup activities. This allows for decentralized management of Spokes, where the control plane is used only for configuration and addon management. The benefits of this mode are:AddOn mode is optional, and can be disabled by setting
addonMode: falsein the Helm values during install. When disabled, FleetConfig controller will operate exactly as before, with the hub'sfleetconfig-controller-managerrunning inunifiedmode and handling the entire Spoke lifecycle.Limitations
At this time, AddOn mode is not supported for EKS hubs. See open-cluster-management-io/ocm#1052
Testing
Dev testing
The following scenarios have been validated:
hub-as-spoke(both modes)hub-as-spokeandspoke. Intermittently dropping all network traffic between the containers for varied lengths of time to simulate an airgap. (addon mode only)For each scenario, the passing criteria was:
spoke.spec.klusterlet.valuesresult in a successful upgrade of KlusterletAutomated testing
Additional assertions have been added to the E2E suite to ensure that the kubeconfig secret can be deleted, Klusterlet upgrades succeed, and no resources become orphaned
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores