fix(mgmt): fail loud on missing federation kubeconfig; rename federation client #120
Merged
scotwells merged 1 commit into feat/federated-deployment-scheduling on May 29, 2026
…Client

Management controllers (WorkloadDeploymentFederator, InstanceProjector) now refuse to start when --enable-management-controllers is set but --federation-kubeconfig is omitted, logging a clear error and exiting 1. Previously the controllers were silently skipped (the same fail-open-silent class as the quota P1 issue), leaving federation and instance projection broken with no operator-visible signal.

Alongside the fail-loud guard, rename the Karmada/federation client identifiers to a neutral "federation" framing (FederationClient, federationRestConfig, --federation-kubeconfig / FEDERATION_KUBECONFIG) across all three controllers, cmd/main.go, and the kustomize base manifests. The previous --upstream-kubeconfig flag is removed; deployments must migrate to --federation-kubeconfig. Update all comments to match.

Coordination note: once this artifact is deployed, management-plane and edge/lab deployments must set FEDERATION_KUBECONFIG (infra PRs in parallel).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Merged commit 553af62 into feat/federated-deployment-scheduling
8 checks passed
Summary
Management controllers now fail fast at startup, with a clear error and immediate exit, when --enable-management-controllers is set but --federation-kubeconfig is not. Previously, WorkloadDeploymentFederator and InstanceProjector were silently skipped, so federation and instance-status projection were invisibly broken: workloads appeared to schedule but never actually federated to edge cells, with no log signal and no alert. This is the same fail-open-silent pattern as the quota P1 (fixed in #118).

Why it matters
An operator who sets --enable-management-controllers=true but mis-wires the federation kubeconfig env var gets a hard failure at pod startup (visible in pod logs, surfaced immediately in a rollout) rather than a degraded system that silently does nothing for hours or until a workload is traced end-to-end.

What changed
Fail-loud guard (cmd/main.go): immediately after kubeconfig loading, if --enable-management-controllers is true and --federation-kubeconfig was not provided, the process logs "management controllers enabled but no federation kubeconfig configured" with "hint: set --federation-kubeconfig" and exits 1. The guard fires before any manager or controller setup, so the failure is instant and unambiguous. The --enable-cell-controllers path is unaffected.

Federation client rename: the Karmada client was named UpstreamClient / --upstream-kubeconfig / UPSTREAM_KUBECONFIG, all directional names that only make sense from one vantage point. These are renamed to FederationClient / --federation-kubeconfig / FEDERATION_KUBECONFIG across cmd/main.go, the three controller structs (WorkloadDeploymentFederator, InstanceProjector, InstanceReconciler), their tests, and the kustomize base manifests. The setupManagementControllers helper introduced in #118 is updated in place. All comments describing the Karmada plane as the "downstream control plane" are corrected.

Deployment coordination required: once this image is deployed, management-plane and edge/lab deployments must supply FEDERATION_KUBECONFIG (not UPSTREAM_KUBECONFIG). Paired with infra #2622 (management plane) and #2623 (edge/lab), which already set FEDERATION_KUBECONFIG.

Test plan
- go build ./...: clean
- go vet ./...: clean
- make test: all tests pass (controller, config, validation, stateful instancecontrol)
- make lint (golangci-lint v2.12.2): 0 issues c1c6261 (post-fix(quota): Enforce and harden project quota for edge-cell Instances #118); staticcheck and gocyclo issues from the original base are resolved
- --enable-management-controllers and no --federation-kubeconfig → expect immediate exit 1 with clear error in logs

🤖 Generated with Claude Code