design-proposal: cross-cluster mesh for tenant access to host services by kvaps · Pull Request #7 · cozystack/community

kvaps · 2026-05-04T17:21:56Z

Summary

Adds a design proposal for cross-cluster connectivity between Cozystack-managed tenant clusters and the host cluster.

The motivating use case: a host cluster running Ceph (managed by Rook) that should be reachable from inside tenant clusters as if it were local storage. Standard single-gateway approaches (Submariner, Kilo's default mesh-granularity=location) bottleneck Ceph traffic; this proposal uses Kilo's mesh-granularity=cross (squat/kilo#328) to build a node-to-node mesh that scales linearly with cluster size and handles Rook-driven failover without controller intervention on the data path.

The proposal covers:

- Topology and why cross-mesh fits Ceph's traffic patterns
- A new operator (cozystack-meshlink-operator) and TenantMeshLink CRD for managing Peer objects on both sides
- Trust model (one-way: Cozystack → tenant; tenants have no host-cluster API access)
- IP allocation: no new address space; existing pod-CIDRs are sufficient
- Failure semantics, edge cases, and alternatives considered

Looking for feedback on the open questions, especially the upstream Kilo PR #328 strategy and whether tenant-side Kilo should be a hard requirement.

Test plan

This is a design proposal; no code yet. Implementation testing is scoped in the proposal and will follow in implementation PRs:

- Unit tests for reconciliation logic
- Admission webhook tests for pod-CIDR overlap detection
- Integration tests with kind (two clusters)
- E2E with real Rook + tenant cluster + Ceph CSI

Propose a controller-driven design that wires Cozystack tenant clusters into a node-to-node WireGuard mesh with the host cluster, using Kilo's mesh-granularity=cross topology. The motivating use case is exposing a Rook-managed Ceph cluster to tenant pods.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

coderabbitai · 2026-05-04T17:22:06Z

Important

Review skipped

Draft detected.


gemini-code-assist

Code Review

This pull request introduces a design proposal for a cross-cluster mesh using Kilo to allow tenant clusters to access host-cluster services like Ceph. The design utilizes a bipartite node-to-node topology managed by a new operator. The review feedback provides several technical improvements, including addressing MTU overhead for WireGuard, analyzing scalability limits of the N x M mesh, implementing fallback logic for node endpoints, using finalizers for robust resource cleanup, and expanding IP disjointness checks to include Service CIDRs.

gemini-code-assist · 2026-05-04T17:23:16Z

+
+### Topology
+
+Both the host cluster and every participating tenant cluster run Kilo with `--mesh-granularity=cross`. In this mode every node is a topology segment of one. Within a single logical location (e.g. all nodes inside one cluster) traffic uses the underlying CNI without WireGuard. Across logical locations every node holds a direct WireGuard tunnel to every node in the other location.


The proposal should address MTU configuration for the cross-cluster mesh. Since WireGuard adds encapsulation overhead (60 bytes over an IPv4 underlay, 80 bytes over IPv6), packets from pods using the default 1500-byte MTU will exceed the tunnel MTU, leading to fragmentation or packet loss. The design should specify how this will be handled, for example by configuring the Kilo interface MTU and ensuring MSS clamping is active, or by adjusting the CNI MTU in the tenant clusters.
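
To make the overhead concrete, here is a minimal arithmetic sketch. The byte counts follow the WireGuard data-packet layout (outer IP header, UDP header, 16-byte WireGuard header, 16-byte authentication tag); the function names are illustrative and not part of Kilo or the proposal.

```python
# Per-packet overhead of a WireGuard data message.
OUTER_IPV4 = 20   # outer IPv4 header
OUTER_IPV6 = 40   # outer IPv6 header
UDP = 8           # UDP header
WG_HEADER = 16    # type + reserved (4), receiver index (4), counter (8)
WG_AUTH_TAG = 16  # Poly1305 authentication tag


def wireguard_overhead(outer_ipv6: bool) -> int:
    """Bytes added to every tunnelled packet."""
    outer_ip = OUTER_IPV6 if outer_ipv6 else OUTER_IPV4
    return outer_ip + UDP + WG_HEADER + WG_AUTH_TAG


def tunnel_mtu(underlay_mtu: int, outer_ipv6: bool) -> int:
    """Largest inner packet that fits in one underlay packet."""
    return underlay_mtu - wireguard_overhead(outer_ipv6)


def tcp_mss(mtu: int, inner_ipv6: bool = False) -> int:
    """MSS for a TCP segment without options on a link with the given MTU."""
    inner_ip = 40 if inner_ipv6 else 20
    return mtu - inner_ip - 20
```

With a 1500-byte underlay this gives a tunnel MTU of 1440 over IPv4 and 1420 over IPv6; a pod interface left at 1500 would therefore need either a lower CNI MTU or MSS clamping to avoid fragmentation.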

gemini-code-assist · 2026-05-04T17:23:16Z

+
+Both the host cluster and every participating tenant cluster run Kilo with `--mesh-granularity=cross`. In this mode every node is a topology segment of one. Within a single logical location (e.g. all nodes inside one cluster) traffic uses the underlying CNI without WireGuard. Across logical locations every node holds a direct WireGuard tunnel to every node in the other location.
+
+For the host ↔ tenant pair, the result is a full bipartite mesh: every tenant node has a tunnel to every host node, and vice versa. The number of tunnels is `N × M` where N is the tenant node count and M is the host node count; this is intentional and is what enables the throughput and HA properties described below.


The N × M bipartite mesh topology may face scalability challenges as the number of nodes increases. For instance, a 100-node host cluster and a 100-node tenant cluster would result in 10,000 tunnels in total, with each node holding 100 cross-cluster WireGuard peers. The proposal should include an analysis of the practical limits for the number of peers the kg-agent and the Linux kernel can manage before performance or control-plane stability is impacted.
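
For reference, the counts in an N × M bipartite mesh can be worked out directly from the topology described in the diff. A minimal sketch (the function and key names are illustrative, not from the proposal):

```python
def bipartite_mesh_size(tenant_nodes: int, host_nodes: int) -> dict:
    """Tunnel and peer counts for one host <-> tenant pair.

    Every tenant node peers with every host node, so the total number of
    tunnels is the product, while each node only holds as many
    cross-cluster peers as there are nodes on the other side.
    """
    return {
        "total_tunnels": tenant_nodes * host_nodes,
        "peers_per_tenant_node": host_nodes,
        "peers_per_host_node": tenant_nodes,
    }
```

For two 100-node clusters the total is 10,000 tunnels, but the per-node WireGuard peer list has 100 entries, which is the figure that matters for kg-agent and the kernel on any single node.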

gemini-code-assist · 2026-05-04T17:23:16Z

+For each `TenantMeshLink`, the operator:
+
+1. Validates `spec.podCIDR` against all other `TenantMeshLink` objects and the host cluster's pod-CIDR; any overlap sets `PodCIDRConflict=True` and aborts further reconciliation for that tenant.
+2. Lists host cluster Nodes; for each node, ensures a `Peer` exists in the tenant cluster with: `publicKey` from the `kilo.squat.ai/wireguard-public-key` annotation, `endpoint` from `kilo.squat.ai/force-endpoint`, and `allowedIPs` containing the node's per-node pod-CIDR.


The operator should have a fallback strategy if the kilo.squat.ai/force-endpoint annotation is missing on a host node. Without a defined endpoint, tenant nodes will not be able to initiate the WireGuard handshake. Consider falling back to the Node's ExternalIP or InternalIP, or surfacing a specific error in the TenantMeshLink status.

Suggested change

2. Lists host cluster Nodes; for each node, ensures a `Peer` exists in the tenant cluster with: `publicKey` from the `kilo.squat.ai/wireguard-public-key` annotation, `endpoint` from `kilo.squat.ai/force-endpoint`, and `allowedIPs` containing the node's per-node pod-CIDR.

2. Lists host cluster Nodes; for each node, ensures a Peer exists in the tenant cluster with: publicKey from the kilo.squat.ai/wireguard-public-key annotation, endpoint from kilo.squat.ai/force-endpoint (falling back to Node IP if missing), and allowedIPs containing the node's per-node pod-CIDR.
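
One way the suggested fallback could look, as a sketch only. Nodes are modelled as plain dicts shaped like the Kubernetes Node JSON; `resolve_endpoint` and `DEFAULT_PORT` are hypothetical names, and the port value is an assumption rather than something the proposal specifies.

```python
from typing import Optional

FORCE_ENDPOINT = "kilo.squat.ai/force-endpoint"
DEFAULT_PORT = 51820  # assumption: the conventional WireGuard port


def resolve_endpoint(node: dict, port: int = DEFAULT_PORT) -> Optional[str]:
    """Pick a WireGuard endpoint for a host node.

    Order: explicit force-endpoint annotation, then ExternalIP, then
    InternalIP. Returns None when nothing usable is found, so the caller
    can surface an error in the status instead of writing a Peer that
    no tenant node can reach.
    """
    annotations = node.get("metadata", {}).get("annotations") or {}
    forced = annotations.get(FORCE_ENDPOINT)
    if forced:
        return forced  # used verbatim; expected to already be host:port

    addresses = node.get("status", {}).get("addresses") or []
    for wanted in ("ExternalIP", "InternalIP"):
        for addr in addresses:
            if addr.get("type") == wanted and addr.get("address"):
                host = addr["address"]
                if ":" in host:  # bracket IPv6 literals
                    host = f"[{host}]"
                return f"{host}:{port}"
    return None
```

Returning `None` rather than guessing keeps the two outcomes in the comment distinct: a fallback address when one exists, and an explicit status error when it does not.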

gemini-code-assist · 2026-05-04T17:23:16Z

+1. Validates `spec.podCIDR` against all other `TenantMeshLink` objects and the host cluster's pod-CIDR; any overlap sets `PodCIDRConflict=True` and aborts further reconciliation for that tenant.
+2. Lists host cluster Nodes; for each node, ensures a `Peer` exists in the tenant cluster with: `publicKey` from the `kilo.squat.ai/wireguard-public-key` annotation, `endpoint` from `kilo.squat.ai/force-endpoint`, and `allowedIPs` containing the node's per-node pod-CIDR.
+3. Lists tenant cluster Nodes; for each node, ensures a `Peer` exists in the host cluster with: `publicKey` from the tenant Node's annotation, `allowedIPs` containing the tenant per-node pod-CIDR, no `endpoint` (the tenant initiates).
+4. Removes orphaned Peer objects on either side using a label selector tied to the `TenantMeshLink` name.


To ensure that Peer objects created in the tenant cluster are reliably cleaned up when a TenantMeshLink is deleted, the operator should use Kubernetes finalizers. Without finalizers, if the operator is unavailable or the tenant API is unreachable during deletion, orphaned Peer resources will remain in the tenant cluster.

Suggested change

4. Removes orphaned Peer objects on either side using a label selector tied to the `TenantMeshLink` name.

4. Uses finalizers to ensure all remote Peer objects are removed from the tenant cluster before the TenantMeshLink is deleted.
5. Removes orphaned Peer objects on either side using a label selector tied to the TenantMeshLink name.
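
The finalizer flow can be sketched without any Kubernetes client, using a dict shaped like the object's metadata. The finalizer name and the `delete_remote_peers` callback are hypothetical; a real operator would go through its API client and requeue on failure.

```python
FINALIZER = "example.invalid/peer-cleanup"  # hypothetical finalizer name


def reconcile(link: dict, delete_remote_peers) -> str:
    """One reconcile pass over a TenantMeshLink-like object.

    While the object is live, make sure the finalizer is present.
    Once a deletionTimestamp is set, remove remote Peers first and only
    then drop the finalizer, so the API server cannot finish deleting the
    object while Peers still exist in the tenant cluster.
    """
    meta = link["metadata"]
    finalizers = meta.setdefault("finalizers", [])

    if meta.get("deletionTimestamp") is None:
        if FINALIZER not in finalizers:
            finalizers.append(FINALIZER)
        return "active"

    if FINALIZER in finalizers:
        try:
            delete_remote_peers(meta["name"])
        except Exception:
            # Tenant API unreachable: keep the finalizer and try again later.
            return "retry"
        finalizers.remove(FINALIZER)
    return "released"
```

The key property is that a failed cleanup leaves the finalizer in place, so the object stays in the terminating state until the tenant API is reachable again.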

gemini-code-assist · 2026-05-04T17:23:16Z

+
+The constraints on pod-CIDRs are:
+
+- The host pod-CIDR and every tenant pod-CIDR must be pairwise disjoint.


The disjointness requirement should be extended to include the Service CIDRs of both clusters. Overlaps between a tenant's pod-CIDR and the host's Service CIDR (or vice versa) can cause routing conflicts, making it impossible for pods to reach internal services or the advertised host services.

Suggested change

- The host pod-CIDR and every tenant pod-CIDR must be pairwise disjoint.

- The host pod-CIDR, host service-CIDR, and every tenant pod-CIDR must be pairwise disjoint.
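
A pairwise disjointness check over named CIDRs is straightforward with Python's standard `ipaddress` module. This is a sketch of the check itself, not of the operator; the names and example ranges are made up for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlaps(named_cidrs: dict) -> list:
    """Return every pair of names whose CIDRs overlap.

    An empty result means the set is pairwise disjoint. IPv4 and IPv6
    ranges are never compared against each other.
    """
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    conflicts = []
    for (name_a, net_a), (name_b, net_b) in combinations(nets.items(), 2):
        if net_a.version == net_b.version and net_a.overlaps(net_b):
            conflicts.append((name_a, name_b))
    return conflicts
```

For example, a tenant pod-CIDR of `10.96.128.0/17` would be flagged against a host service-CIDR of `10.96.0.0/16`, which is exactly the case a pod-CIDR-only check misses.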

@squat

Adjust the proposal to reflect that the controller will be developed as an independent project under the kilo-io organization, per confirmed interest from Kilo maintainer @squat. Generalize the CRD from a tenant-specific TenantMeshLink to a tenant-agnostic ClusterMesh that references peer clusters through a map of kubeconfig Secrets. Move all tenant semantics into a dedicated Cozystack integration section that also accounts for the kubernetes-nodes split (PR cozystack#8) so a single ClusterMesh covers multi-location, multi-backend tenants.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

…oller allowlist + RBAC monopoly

Drop the planned admission webhook. Instead, harden the design with two controls owned by the host-cluster operator:

- The controller is the only principal with write access to kilo.squat.ai/Peer in any participating cluster. Tenant-provisioning, the dashboard, and cluster admins can author ClusterMesh objects (intent) but never touch Peer directly.
- The controller is configured at deploy time with a subnet allowlist (--allowed-cidr). Any ClusterMesh whose allowedNetworks fall outside that list is rejected with a status condition before any Peer is written. The allowlist cannot be widened through the ClusterMesh API.

Collapse the per-cluster podCIDR + advertise fields into a single allowedNetworks list, since both are now validated against the same allowlist and can be expressed uniformly.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

…ontainment

Make the WG-IP threat model explicit. A tenant root that tampers with a Node's kilo.squat.ai/wireguard-ip annotation must not be able to inject a Peer with attacker-chosen allowedIPs onto the host side. Add:

- A second controller-level allowlist, --allowed-wireguard-cidr, that bounds where any kilo0 interface in the mesh may live. spec.clusters carries no WG-CIDR field; the WG address space is host-admin-owned infrastructure, not part of per-mesh data.
- Per-Node validation alongside the existing mesh-level checks: WG-IP must be /32 (or /128), in --allowed-wireguard-cidr, and unique within its cluster. PodCIDRs must be in allowedNetworks. Failures skip the offending Node only; the mesh stays Ready.
- A primary-boundary statement in Security: the host's exposure to a tenant peer is bounded exclusively by the host-side Peer.allowedIPs, so anything the tenant does to its own kilo0, routes, or kg-agent post-reconcile cannot widen that bound.
- Cozystack integration spelled out for both allowlists: pod-pool to --allowed-cidr, WG-pool to --allowed-wireguard-cidr; tenant provisioning allocates from each.

WG-IP is now restored to Peer.allowedIPs (standard Kilo Peer shape), since the new allowlist makes that safe and it brings cross-cluster diagnostics back.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

…Networks list

Drop the second --allowed-wireguard-cidr allowlist. WG-CIDR is just another entry in the same allowedNetworks list as pod-CIDR and service-CIDR; per-Node WG-IP containment is validated against the cluster's own allowedNetworks rather than against a separate global pool. A tenant root cannot widen its surface to host pod/WG/service-CIDR because those CIDRs live in the host's allowedNetworks (a different spec.clusters entry), and per-Node containment rejects out-of-range annotations.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
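
The per-Node containment rules from these commit messages (WG-IP is a /32 or /128, lies inside the cluster's own allowedNetworks, is unique within the cluster; PodCIDRs lie inside allowedNetworks) can be sketched as below. This is an illustration under assumptions: nodes are plain dicts shaped like Node JSON, `validate_node` is a hypothetical helper, and the prefix rule is taken as written here even though a later note in this thread mentions a correction to it.

```python
import ipaddress

WG_IP_ANNOTATION = "kilo.squat.ai/wireguard-ip"


def validate_node(node: dict, allowed_networks: list, seen_wg_ips: set) -> list:
    """Return the reasons a Node should be skipped (empty list = accepted).

    seen_wg_ips accumulates accepted WG-IPs across calls for one cluster,
    which is how uniqueness within the cluster is enforced.
    """
    allowed = [ipaddress.ip_network(c) for c in allowed_networks]

    def contained(net) -> bool:
        return any(net.version == a.version and net.subnet_of(a) for a in allowed)

    problems = []
    annotations = node.get("metadata", {}).get("annotations") or {}
    raw = annotations.get(WG_IP_ANNOTATION)
    if raw is None:
        problems.append("missing WG-IP annotation")
    else:
        try:
            iface = ipaddress.ip_interface(raw)
        except ValueError:
            iface = None
            problems.append("unparseable WG-IP annotation")
        if iface is not None:
            if iface.network.prefixlen != iface.network.max_prefixlen:
                problems.append("WG-IP is not a /32 or /128")
            elif not contained(iface.network):
                problems.append("WG-IP outside allowedNetworks")
            elif iface.ip in seen_wg_ips:
                problems.append("duplicate WG-IP in cluster")
            else:
                seen_wg_ips.add(iface.ip)

    for cidr in node.get("spec", {}).get("podCIDRs") or []:
        if not contained(ipaddress.ip_network(cidr)):
            problems.append(f"podCIDR {cidr} outside allowedNetworks")
    return problems
```

A Node with a non-empty result would simply be left out of the generated Peers, matching the "skip the offending Node only" behaviour described above.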

squat · 2026-05-05T20:04:25Z

+
+- Pods in any peer cluster can reach selected services in another peer cluster as if they were on the local network. (Cozystack use case: tenant pods reach host Ceph monitors, OSDs and MDS daemons.)
+- Nodes added to or removed from any participating cluster are wired into / detached from the mesh automatically, without per-node manual configuration.
+- A compromise of a peer cluster (up to and including full root on a peer node) cannot affect routing in another peer cluster beyond the network surface that was explicitly granted, and cannot affect unrelated peers.


unless it is the cluster running the controller, in which case I guess they do get perms on peer clusters, but that's not new

Yes, intentional — that's the design. The controller cluster does have perms on remote Peers via the kubeconfigs it holds, and that's exactly the trust direction we want.

squat · 2026-05-05T20:10:39Z

+      # The controller's own cluster — no kubeconfig needed.
+      local: true
+      allowedNetworks:
+        - 10.4.0.0/16          # WG-CIDR


Should any of these be named fields in the struct rather than open fields in allowedNetworks? If the WireGuard mesh CIDR and the Pod CIDR are mandatory then maybe they get special treatment? Alternatively, using the open list can later be easily migrated into the stricter design

I guess these are all technically optional and it just determines which networks from the Peer resources we want to honor / validate

Kept the flat list for v1alpha1 — it's intentionally narrow ('nothing else than what's listed can end up in a Peer') and easy to migrate to typed fields later if needed. Recorded as Open Question #6 in 8830a7c.

Right, exactly. The list is purely about what's permissible / honoured — not categorising CIDRs into kinds. That's why we kept it flat.

squat · 2026-05-05T20:18:28Z

+
+**Mesh-level (halts reconciliation on failure):**
+
+1. Every CIDR in every `spec.clusters[*].allowedNetworks` is a subset of the controller's `--allowed-cidr` allowlist; otherwise `NetworksNotAllowed=True`.


This is kind of annoying, it means that there is functionally no difference between the cluster admin and the mesh admin. If you want to create a new mesh, then you have to edit the mesh controller deployment to add the allow list. I need to think about this a bit. What are we hoping to defend against here?

Agreed, and dropped. The --allowed-cidr allowlist is gone in 8830a7c. The current design relies on RBAC on ClusterMesh as the address-surface chokepoint, and cluster-side defense-in-depth is now tracked in Alternatives as future Kilo work (your PeerClass suggestion below).

squat · 2026-05-05T20:20:11Z

+  namespace: kilo
+spec:
+  clusters:
+    cluster-a:


Maybe in keeping with Kubernetes convention this should become a list of named structs, like how a Pod contains a list of named containers.

Done in 8830a7c — spec.clusters is now a list of named entries (- name: cluster-a, ...).

squat · 2026-05-05T20:38:03Z

+- **`--allowed-cidr` allowlist** bounds what `spec.clusters[*].allowedNetworks` can ever declare. Pod-CIDRs, WG-CIDRs, and service-CIDRs all flow through the same allowlist. A user who can author `ClusterMesh` objects cannot widen the address surface beyond what the host admin pre-approved.
+- **Per-Node containment** validates that every observed annotation (`Node.Spec.PodCIDRs`, `kilo.squat.ai/wireguard-ip`) lies within the cluster's own `allowedNetworks`. A tenant root forging an annotation that points at the host pod-CIDR, host WG-CIDR, or any other CIDR the tenant did not declare itself is rejected — the offending Node is skipped and never appears as a Peer on the host side.
+- **Trust direction by kubeconfig placement.** Whichever cluster holds the controller and the kubeconfig Secrets is the side that drives writes; the side whose kubeconfig is held cannot write back. In Cozystack, only the host holds tenant kubeconfigs — trust flows host → tenant.
+- **Cross-mesh isolation.** Each `ClusterMesh`'s Peers are labelled with the mesh name; the controller never deletes or modifies Peers belonging to a different mesh, and `allowedNetworks` overlap between meshes (not just within a single mesh) is rejected.


We should probably also add labels for the source cluster name so if two controllers running on different hosts are managing meshes on the same tenant (some triangle where the two hosts don't know about each other) then the controllers are less likely to compete for ownership of Peers if the mesh object has the same name.

Added in 8830a7c — every generated Peer now carries kilo-clustermesh.io/mesh: <mesh-name> and kilo-clustermesh.io/source-cluster: <cluster-name>, with the controller's ownership scoped to its own (mesh, source-cluster) pairs. Two controllers can coexist on the same remote cluster as long as they're not declaring the same source cluster.
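
As a sketch of what ownership scoped to (mesh, source-cluster) pairs could mean in practice: the two label keys are the ones named in the reply above, while the helper names and the dict-shaped Peers are illustrative assumptions.

```python
MESH_LABEL = "kilo-clustermesh.io/mesh"
SOURCE_LABEL = "kilo-clustermesh.io/source-cluster"


def owned_peers(peers: list, mesh: str, source_cluster: str) -> list:
    """Peers this controller may touch: both labels must match."""
    owned = []
    for peer in peers:
        labels = peer.get("metadata", {}).get("labels") or {}
        if labels.get(MESH_LABEL) == mesh and labels.get(SOURCE_LABEL) == source_cluster:
            owned.append(peer)
    return owned


def orphaned_peers(peers: list, mesh: str, source_cluster: str, desired: set) -> list:
    """Names of owned Peers that are no longer desired and can be deleted."""
    return [
        p["metadata"]["name"]
        for p in owned_peers(peers, mesh, source_cluster)
        if p["metadata"]["name"] not in desired
    ]
```

Because cleanup only ever considers Peers carrying both labels, a second controller using the same mesh name but a different source cluster, or a manually created Peer with no labels, is never a deletion candidate.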

squat · 2026-05-05T20:45:35Z

+2. **Cluster identifier scope**: should `spec.clusters` keys be free-form strings or follow a stricter schema (e.g. DNS-1123 labels) so they can be reused as label values? Likely the latter; to confirm during implementation.
+3. **Transitive routing**: with three or more clusters in the same `ClusterMesh`, the controller currently builds a full mesh. Should it support partial topologies (e.g. star)? Out of scope for v1; the CRD shape allows it later.
+4. **Multi-controller scenarios**: in a deployment where two clusters each run their own controller, how should they coordinate? Likely via a "leader" cluster identified in the CRD; deferred.
+5. **Per-peer opt-in for received CIDRs**: today `allowedNetworks` is a unilateral declaration on the source side, plus a global allowlist on the controller. Should there additionally be a per-peer `acceptedNetworks` field, so a peer can refuse to accept some of what another peer publishes? Likely unnecessary given the controller-level allowlist, but worth revisiting once there are multi-tenant deployments with heterogeneous policies.


The more I read about the controller allowlist, the more I actually started leaning in this direction. Maybe this needs to be a flag on Kilo, actually (or an entirely new PeerClass resource that declares what allowed IPs are permissible for every Peer in a cluster). This would allow individual clusters to guard against peers being created by a rogue cluster mesh controller. It's not blocking: this is orthogonal Kilo work that would be great to upstream to improve the administration of Kilo meshes.

Agreed — tracked in Alternatives as future Kilo work in 8830a7c, with credit to you. Independent of this ClusterMesh proposal but a clear next direction for hardening Kilo administration.

kvaps

(withdrawn)

Rename proposal to ClusterMesh, retitle, and add an explicit Scope section. Move the previously committed Cozystack integration into a deferred / exploratory section that lists potential patterns (routed pod-CIDR mesh, NAT egress, Service-mirror via Outline) without committing to any. Adjust rollout phases and tests to remove Cozystack-specific deliverables that belong to follow-up work.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Arsolitt · 2026-05-11T17:23:20Z

Two related scalability concerns that I think are missing from the proposal.

Aggregate fan-out across tenants

The scaling analysis covers a single ClusterMesh, but on the host side peers from all tenants accumulate on every node:

total_wg_peers_per_host_node = Σ(nodes_in_tenant_i)

- Small (10 tenants × 10 nodes) → 100 peers/host node, ~3K Peer CRs in etcd
- Medium (50 × 15) → 750 peers/host node, ~22K Peer CRs
- Large (200 × 20) → 4,000 peers/host node, ~120K Peer CRs

WireGuard kernel datapath is fine (O(1) AllowedIPs lookup), but the control plane is a different story:

- etcd: tens of thousands of Peer objects, each generating watch events
- kg-agent: receives ALL Peers, rebuilds wg setconf on every change - at 4K peers that's ~200KB config reparsed per event
- controller: reconcile loop across all ClusterMesh CRs × all nodes × all peers

The proposal needs a scalability section with realistic limits and guidance on when to shard tenants across host clusters.
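
The aggregate formula above is easy to turn into a small capacity calculator. In this sketch the host node count of 30 is an assumption borrowed from the nodeSelector example later in this comment; with it, the tunnel totals happen to line up with the approximate figures quoted for the three scenarios, though the comment does not state how those figures were derived.

```python
def peers_per_host_node(tenant_node_counts: list) -> int:
    """WireGuard peers on one host node: the sum of all tenant nodes."""
    return sum(tenant_node_counts)


def total_tunnels(tenant_node_counts: list, host_nodes: int) -> int:
    """Host <-> tenant tunnels across all meshes (full bipartite per tenant)."""
    return sum(tenant_node_counts) * host_nodes


# Scenarios from the comment, as (tenants, nodes per tenant).
SCENARIOS = {"small": (10, 10), "medium": (50, 15), "large": (200, 20)}
ASSUMED_HOST_NODES = 30  # assumption, not stated for these scenarios


def summarize() -> dict:
    """Per-scenario (peers per host node, total tunnels)."""
    result = {}
    for name, (tenants, nodes) in SCENARIOS.items():
        counts = [nodes] * tenants
        result[name] = (
            peers_per_host_node(counts),
            total_tunnels(counts, ASSUMED_HOST_NODES),
        )
    return result
```

Passing real per-tenant node counts instead of uniform ones gives a quick way to see how close a host cluster is to whatever limit the scalability section ends up recommending.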

Full bipartite without topology awareness

Every node in cluster A gets a tunnel to every node in cluster B. In heterogeneous clusters only a subset of nodes run the target workload - tunnels to irrelevant nodes carry zero useful traffic.

A nodeSelector on the cluster spec would scope the mesh to relevant nodes:

spec:
  clusters:
    - name: host
      local: true
      allowedNetworks: [...]
      nodeSelector:
        storage-role: ceph

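If 10 of 30 host nodes are selected, each tenant node's peer count drops by 3x (30 → 10). Across 50 tenants of 15 nodes, the average load falls from 750 to 250 peers per host node; the selected host nodes still peer with every tenant node, while the unselected ones carry none.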
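
A sketch of how such a selector might be evaluated, using equality-based label matching in the style of a Pod `nodeSelector`. The `nodeSelector` field on the cluster spec is the suggestion made in this comment, not an existing field, and the helper names are illustrative.

```python
def matches(labels: dict, selector: dict) -> bool:
    """True when every selector key is present with the same value."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_nodes(nodes: list, selector: dict) -> list:
    """Names of nodes that would take part in the mesh.

    An empty selector matches every node, which keeps today's
    full-bipartite behaviour as the default.
    """
    return [
        node["metadata"]["name"]
        for node in nodes
        if matches(node["metadata"].get("labels") or {}, selector)
    ]
```

Only the selected nodes would then be turned into Peers on the remote side, so tenant nodes hold tunnels to the storage nodes and nothing else.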

Combined effect

These multiply: topology awareness reduces per-mesh fan-out, aggregate analysis tells you how many meshes you can sustain. Without both, there's no way to give operators capacity planning guidance.

@squat

- spec.clusters: change from map keyed by cluster name to a list of named structs, in line with Kubernetes convention.
- Drop the in-controller --allowed-cidr allowlist. It functionally collapsed cluster-admin and mesh-admin roles and overlapped with RBAC on ClusterMesh + per-Node containment. Defense story now relies on RBAC at the ClusterMesh creation point.
- Add kilo-clustermesh.io/source-cluster label to every generated Peer, enabling co-existence of multiple controllers writing into the same remote cluster without ownership conflicts.
- Add an Alternatives entry for a future Kilo-side defense-in-depth mechanism (cluster-level allowlist or PeerClass CRD), as suggested by @squat during review.
- Move the flat-vs-typed-fields shape of allowedNetworks to Open Questions, noting the current flat list is intentionally narrow and easy to migrate later.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Generic cluster-to-cluster mesh controller for Kilo that connects 2+ Kubernetes clusters into a flat node-to-node WireGuard mesh using Kilo's mesh-granularity=cross topology. Implements the design proposal from cozystack/community#7.

Key components:

- ClusterMesh CRD with named CIDR fields (podCIDRs, wireguardCIDR, serviceCIDR, additionalCIDRs) for precise validation
- Two-layer validation: mesh-level CIDR overlap detection and per-node annotation validation (PodCIDR containment, WG-IP validity)
- Multi-cluster client via controller-runtime pkg/cluster with graceful manager restart on cluster set changes
- Peer builder creating per-node and anchor Peer CRDs with label-based ownership isolation between meshes
- CRD self-install via embedded YAML applied on startup
- Finalizer-based cleanup on ClusterMesh deletion
- Helm chart with 38 helm-unittest assertions
- Integration tests using dual envtest API servers
- CI/CD with GitHub Actions (lint, test, integration, helm, release)

Signed-off-by: Arsolitt <arsolitt@gmail.com>

Add docs/known-gaps.md tracking outstanding work, divergences from the upstream proposal (cozystack/community#7), and settled design decisions that should not be re-litigated. Link from README documentation table and project-status note. Captures one blocker gap (no Node watches), one operational risk (silent anchor-peer suppression), and three proposal text corrections (annotation name, prefix rule, flat-vs-typed CRD shape).

Signed-off-by: Arsolitt <arsolitt@gmail.com>

gemini-code-assist Bot reviewed May 4, 2026


kvaps and others added 4 commits May 5, 2026 19:24

squat reviewed May 5, 2026


kvaps commented May 11, 2026


Arsolitt mentioned this pull request May 20, 2026

feat: initial Kilo cluster-mesh operator POC cozystack/kilo-clustermesh-operator#2

Open

7 tasks


