Skip to content

Add logical cluster migration#4191

Open
ntnn wants to merge 26 commits into
kcp-dev:mainfrom
ntnn:ws-migration
Open

Add logical cluster migration#4191
ntnn wants to merge 26 commits into
kcp-dev:mainfrom
ntnn:ws-migration

Conversation

@ntnn
Copy link
Copy Markdown
Member

@ntnn ntnn commented Jun 5, 2026

Summary

This PR adds migrating logical clusters between shards.
The intent of an admin to migrate an LC is expressed using LogicalClusterMigration.migration.kcp.io.
The object is replicated to the cache-server and used to synchronize the migration between the origin and destination shard.
The destination shards pulls the data of the migrating LC using the LogicalClusterDump type through the migrating VW.
The migrating VW was added only to not put the pressure of the migration on the front-proxy as well.

More details on the process are in the reconciler package docs.

The happy path works but it still needs some resilience and we'll probably hit some real blockers once we try it on real infrastructure.
Also I haven't put in guard rails yet for who can actually create lcmigration etcpp. But I don't want the PR to get even bigger than it is.

What Type of PR Is This?

/kind feature

Related Issue(s)

Fixes #

Release Notes

Added migrating logical clusters between shards gated by the LogicalClusterMigration feature gate

@kcp-ci-bot kcp-ci-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. dco-signoff: yes Indicates the PR's author has signed the DCO. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API labels Jun 5, 2026
@kcp-ci-bot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign embik for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kcp-ci-bot kcp-ci-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jun 5, 2026
@ntnn
Copy link
Copy Markdown
Member Author

ntnn commented Jun 5, 2026

/test pull-kcp-test-e2e

flake

 === FAIL: test/e2e/authorizer TestShardMetricsEndpoint (30.36s) 

@kcp-ci-bot kcp-ci-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 5, 2026
@ntnn ntnn added this to tbd Jun 5, 2026
@ntnn ntnn moved this to In review in tbd Jun 5, 2026
s.blockDeltas.Lock()
defer s.blockDeltas.Unlock()

if s.ignoreFunc != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't handleBatchDeltas need to special-case ReplacedAllInfo deltas, similar to how handleDeltas does?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReplacedAllInfo isn't in the batchable map in the FIFO kube implementation, so it that type can't flow through the handleBatchDeltas. It'd make sense to be defensive here and add the implementation there as well but I decided not to do that because I wanted to keep the modifications small-ish. On top I don't think ReplacedAllInfo will be batched because it is a batching type by itself.

Comment thread pkg/informer/informer.go
continue
}
for _, obj := range objs {
if err := store.Delete(obj); err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible the handlers that use this informer are still processing this object? It sounds a bit racy :D I wonder if it could be moved into the controller (the caller, using this informer), purge outside of the reconcile func while holding a lock. So that all threads can observe the same events.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yesn't :D There is a race but I don't think the transition has to be perfect.

The ignore simply tells the informers to ignore any further events from that cluster, the purge cluster then cleans out the stores to prevent e.g. lists from seeing the (stale) objects; so arguable all handlers are observing the same events. I was thinking of wiring ignoreFunc into filterFunc (filterFunc filters events sent to handlers) but decided against it because filterFunc ignores new watched events, so it ignores all events for all handlers equally, while filterFunc could take hold while events are being fanned out to handlers.

And for cases where an object from a migrating cluster is already in the workqueue of a handler we can't do anything except hope that the handler is implemnented correctly and discards the object once it is gone. At that point I'd bet more on eventual consistency rather than trying to make the split perfect.

}

// ForceRelist causes the underlying reflector to perform a full relist
// on the next list/watch cycle.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... causes the underlying reflector to perform a full relist IF the said reflector also implements this magical forceRelister interface.

I think more context could be added into the comment. With the single user of this being the LC migration controller, it's not super obvious (without further digging) WHY is this even here.

// lastSyncResourceVersionMutex guards read/write access to lastSyncResourceVersion
lastSyncResourceVersionMutex sync.RWMutex
// activeWatch holds the current watch so ForceRelist can stop it.
activeWatch watch.Interface
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit, but could this be thisWatch, just watch or something like that? :D While reading, I was wondering that if this is THE active watch, I was looking for where is the other, inactive watch.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just watch clashes with the imported package watch (even if its a field^^ I try not to clash names) and imho thisWatch and activeWatch have the same problem, but activeWatch is a bit more speaking because that is the actively used watch^^
But we could use cur(rent)Watch e.g.

@kcp-ci-bot kcp-ci-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 5, 2026
ntnn added 13 commits June 8, 2026 09:07
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
ntnn added 13 commits June 8, 2026 09:07
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
Signed-off-by: Nelo-T. Wallus <red.brush9525@fastmail.com>
Signed-off-by: Nelo-T. Wallus <n.wallus@sap.com>
The paths aren't equal and making a shared list for two entries is excessive.
@kcp-ci-bot
Copy link
Copy Markdown
Contributor

@ntnn: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kcp-test-e2e-shared b5456ad link true /test pull-kcp-test-e2e-shared

Full PR test history

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Comment on lines +127 to +135
// <prefix>/<group>/<resource>/customresources/<lc>/...
if parts[2] == "customresources" {
if len(parts) < 4 {
return "", ""
}
return prefix + parts[0] + "/" + parts[1] + "/customresources/" + parts[3], logicalcluster.Name(parts[3]) + "/"
}
// <prefix>/<group>/<resource>/<lc>/...
return prefix + parts[0] + "/" + parts[1] + "/" + parts[2], logicalcluster.Name(parts[2]) + "/"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This handles resources originating from CRDs and built-ins, but not identity-based.

We'll need something like what we have in https://github.com/kcp-dev/kubernetes/blob/kcp-1.36.0/staging/src/k8s.io/apiserver/pkg/storage/etcd3/store_kcp.go#L113

I'm wondering though how we can do that from the key alone; we have far less information than the apiextensions-apiserver code above ^^

I think we can assume a few things and say that:

  • After all the trimming you're doing, the key we have is:
    []string{
      "<Group>",
      "<Resource>",
      [ "customresources" or "<Identity>" ,  ]
      "<Cluster>",
      [ "<Namespace>" , ]
      "<Name>",
    }
    
  • Cluster name can be at the 3rd or 2nd index (0-counting) depending if the resource is built-in or not.
    • Since we don't know which, we should check both against != targetLogicalCluster+"/". If there is a match, we have a good key.
    • Downside is that we theoretically could have a false-positive match. But the only way to do that is if <Identity> matches targetLogicalCluster.
      • Consequence of that would be that an APIExport owner could theoretically forge such etcd keys (or rather those who bind to it) that they would be deleted even if the cluster is not matching. IMHO this is not a HUGE deal (and btw we should add salt to the identity hash function :D).

}

// belongsToLogicalCluster reports whether an etcd key belongs to the given logical cluster.
func belongsToLogicalCluster(prefix, key string, target logicalcluster.Name) bool {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same problem as in https://github.com/kcp-dev/kcp/pull/4191/changes#r3373299200

As a side note, could this be refactored into a single place?

//
// TODO: paginate via spec.Continue / status.Continue so a single dump call
// doesn't have to hold all keys in memory.
func scanEtcdEntries(ctx context.Context, etcdClient *clientv3.Client, storagePrefix string, target logicalcluster.Name) ([]migrationv1alpha1.EtcdEntry, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth refactoring with scanEtcdEntries pkg/reconciler/migrating/logicalclustermigration/datacleanup.go ?

writeError(w, r, apierrors.NewUnauthorized("no user info"))
return
}
if !slices.Contains(user.GetGroups(), bootstrappolicy.SystemExternalLogicalClusterAdmin) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth documenting.

"net/http"
)

const migrationDumpHandlerPath = "/apis/migration.kcp.io/v1alpha1/logicalclusterdumps"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this migrationDumpHandlerPath const could moved to somewhere else, along with the one at pkg/virtual/migratingworkspaces/doc.go ? Like - if, as a first-time-reader, I list all references to this const, it would be nice see that this is used in both of those places; makes it a bit easier to orient oneself too :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that and produced an import cycle :D
But I didn't want to make a separate package just for this constant, but it would be better yeah. I was thinking of maybe placing it in a helper package in the API.

"github.com/kcp-dev/kcp/pkg/virtual/migratingworkspaces"
)

func BuildVirtualWorkspace(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth adding a comment what's the purpose of this VW; what you've summarized in the PR description is a good starting point. Otherwise whoever reads this in a year's time will try to delete this :D

"github.com/kcp-dev/kcp/test/integration/framework"
)

var configMapGVR = schema.GroupVersionResource{Group: "", Version: "v1", Resource: "configmaps"}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This global is guaranteed to get overwritten by some unfortunate person with something wrong. Move to the func?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true - I think I only did this because the linter complained about the same variable definition :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dco-signoff: yes Indicates the PR's author has signed the DCO. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants