Multi-region: one control plane, many datacenters (region → zone model, agent-based transport) #157
Replies: 4 comments
-
Addendum: agent scope (generic manifest application and the per-service operators)
A scope question came up: since the agent is deployed onto each Harvester at bootstrap, could it also apply Kubernetes manifests on command, and eventually replace the separate per-service operators (key vault, database, …) running on every Harvester? Decision after going through the operator code:
1. Generic manifest primitives: yes, in protocol v1. They're essentially free. The agent must hold a local Kubernetes client regardless, because every provider operation it executes is a CR manipulation (KubeVirt VMs, KubeOVN VPCs/Subnets,
2. Operator delivery through the agent: yes, near-term win. Today each operator (and its CRDs) is delivered to each Harvester out-of-band. With manifest primitives, the control plane can ship and upgrade the operators through the agent at bootstrap: one delivery channel per datacenter, no per-operator installation plumbing, and operator versions become control-plane-managed facts.
3. Replacing the operators with the agent: deferred, deliberately. "Apply manifests" is not what the operators are for. Reading the key-vault controller (~1,500 lines): it watches CRs in a continuous reconcile loop with requeues, discovers the Raft leader pod among the vault replicas, drives the vault's own API (mounts, policies, AppRoles), handles unseal flows, and runs finalizer cleanup that connects into the service before letting CRs go. The database operator has the same shape per the managed-services framework contract. Replacing them means the agent would have to host controller loops (an embedded controller-runtime manager with per-service reconciler modules). That is coherent as a future architecture ("one agent, many controllers" would collapse N operator Deployments per Harvester into one binary), but it couples agent releases to every service's controller code, unions their RBAC into one identity, and makes the agent the largest possible blast radius. That trade is not worth taking while the operator count is small.
Revisit trigger for #3: when the per-Harvester operator count or their upgrade orchestration becomes a real operational burden, the migration path is incremental. The agent already speaks the manifest/status/watch primitives, so individual operators can be absorbed as embedded reconciler modules one service at a time without protocol changes.
-
Milestone breakdown: from rails (foundation) to agent-executed provisioning
The foundation PR (#160) lands the rails: the region/zone model, the outbound WebSocket channel, agent tokens, and derived health. It is dark and non-breaking: an agent can connect and show up as healthy, but nothing routes provisioning through it yet, and dc-api still calls Harvester/Rancher directly. This comment maps the path from there to the actual goal.
Design principle: every cluster is a region, uniformly
There is no privileged "home" region and no direct-to-Harvester path anywhere. The Harvester cluster the control plane happens to run on is reached exactly like every other one: through that cluster's agent, dialing out to the control plane. dc-api never holds a Harvester kubeconfig for any region, not even the one it sits on.
The payoff is that the control plane becomes location-independent. Its only hard runtime dependency is that agents can reach its WSS endpoint over 443. Move it to a different Harvester, a different cloud, or a workstation, and nothing changes for the regions: the agents simply reconnect to the new endpoint. One code path for one and for a hundred clusters.
End-state credential model
dc-api stops holding cluster credentials entirely. In their place:
Blast radius shrinks from "compromise the control plane → every region's clusters" to "compromise the control plane → the channel tokens" (an attacker could send malicious intent, which is bounded and auditable, but does not get the raw cluster keys). And because every agent dials out, no region exposes an inbound API.
Milestones
M-A. Protocol v1: operation verbs. Extend the v0 envelope (same JSON-over-WS framing, forward-compatible) with
M-B. Agent executor. Give the agent an in-cluster client that applies intent idempotently against the local cluster, reconciles, and reports status back over the channel. Defines the RBAC-scoped ServiceAccount and the reconcile/retry semantics. The manifest primitive here is also what lets us deliver per-service operators (DB, KV, …) through the agent instead of running a separate operator install per cluster.
M-C. Provider routing in dc-api. Implement the existing compute/network provider interfaces as a channel-backed provider selected per region/zone via the provider registry. It serializes intent to the right agent instead of calling Harvester directly. This replaces the direct driver for all regions (no special case). Status flows back into the
M-D. Credential isolation + control-plane portability. Remove Harvester kubeconfigs from dc-api entirely; agents run with in-cluster SAs. Document the bootstrap path (how the first agent and the control plane itself come up before any channel exists) and a "move the control plane" runbook.
Open questions to resolve as we go
-
Foundation is live, and an agent has now connected to the deployed control plane, end to end
Status update on the rails from the milestone breakdown above: #160 is merged and deployed, and as of today the loop is proven for real. A The honest, no-magic version of how it was wired: the agent reached dc-api through a Agent log (laptop Region card in cloud-ui reading
Supporting PRs now open
Next: M-A, turn the heartbeat into a command channel (protocol v1)
The channel today speaks four frames (
Open questions I'll settle in the design pass before coding: delivery semantics (at-least-once + the agent-side dedupe window); how
Starting the M-A design now. I'll post the protocol sketch here for review before implementation.
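One way the delivery-semantics question (at-least-once plus an agent-side dedupe window) could play out is sketched below in Python, purely as an illustration. The `DedupeWindow` class, its `window_seconds` parameter, and the choice to replay the cached result on redelivery are all assumptions made for this sketch; none of them has been decided in this thread.

```python
import time
from collections import OrderedDict
from typing import Any, Callable


class DedupeWindow:
    """Hypothetical agent-side helper: run each request id at most once per window."""

    def __init__(self, window_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_seconds = window_seconds
        self._clock = clock
        # request id -> (time handled, result); insertion order is age order
        self._seen = OrderedDict()

    def _evict_expired(self) -> None:
        cutoff = self._clock() - self.window_seconds
        while self._seen:
            oldest_id, (handled_at, _) = next(iter(self._seen.items()))
            if handled_at > cutoff:
                break  # everything after this entry is newer
            del self._seen[oldest_id]

    def handle(self, request_id: str, execute: Callable[[], Any]) -> Any:
        """Execute on first delivery; on a redelivery inside the window, replay the result."""
        self._evict_expired()
        if request_id in self._seen:
            return self._seen[request_id][1]
        result = execute()
        self._seen[request_id] = (self._clock(), result)
        return result
```

With at-least-once delivery the control plane may resend a frame after a reconnect; a window like this keeps a repeated create from running twice, at the cost of holding recent results in agent memory. Once an id ages out of the window, a late redelivery would execute again, which is why the window length is part of the open question.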
-
M-A design sketch: the frame envelope and the regions visibility model
Two design points are now concrete (the full design lands as
1. Frame envelope: one generic request/response, not a type per verb. Rather than a distinct frame
req      → { "type":"req", "id":"<uuid>", "op":"get_inventory", "params":{…} }
res (ok) → { "type":"res", "id":"<uuid>", "ok":true, "result":{…} }
res (err) → { "type":"res", "id":"<uuid>", "ok":false, "error":{ "code":"…", "message":"…" } }
progress → { "type":"progress", "id":"<uuid>", "stage":"…", "detail":"…" } // advisory, 0+
The agent advertises the ops it can serve on
2. Regions visibility: admin vs. tenant. Node counts and capacity (from
Sequencing note worth flagging: today provisioning still runs direct-to-Harvester, so a region's live status means "an agent is connected", not "this region can provision". The live-agent placement gate therefore switches on with M-C (when the agent becomes the provisioning path); until then placeability rides an admin "region enabled" flag, so nothing that works today breaks.
In short:
Proceeding to build the RPC machinery now: server
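To make the req/res/progress envelope from point 1 concrete, here is a small Python sketch of how the control-plane side might build frames and correlate replies by `id`. The field names come from the frames in this comment; the helper names (`make_req`, `make_res_ok`, `make_res_err`, `PendingCalls`) and the decision to silently ignore frames with an unknown id are assumptions for illustration, not the actual dc-api implementation.

```python
import json
import uuid
from typing import Any, Dict, Optional


def make_req(op: str, params: Dict[str, Any], req_id: Optional[str] = None) -> str:
    """Serialize a request frame; the id ties it to later res/progress frames."""
    return json.dumps({"type": "req", "id": req_id or str(uuid.uuid4()),
                       "op": op, "params": params})


def make_res_ok(req_id: str, result: Dict[str, Any]) -> str:
    return json.dumps({"type": "res", "id": req_id, "ok": True, "result": result})


def make_res_err(req_id: str, code: str, message: str) -> str:
    return json.dumps({"type": "res", "id": req_id, "ok": False,
                       "error": {"code": code, "message": message}})


class PendingCalls:
    """Hypothetical bookkeeping: match incoming frames to outstanding requests."""

    def __init__(self) -> None:
        self._pending: Dict[str, Dict[str, Any]] = {}

    def sent(self, frame: str) -> str:
        req_id = json.loads(frame)["id"]
        self._pending[req_id] = {"stages": [], "outcome": None}
        return req_id

    def receive(self, frame: str) -> Optional[Dict[str, Any]]:
        """Return the finished call on a res frame, otherwise None."""
        msg = json.loads(frame)
        call = self._pending.get(msg.get("id"))
        if call is None:
            return None  # unknown or already completed id: ignore
        if msg["type"] == "progress":
            call["stages"].append(msg["stage"])  # advisory, zero or more
            return None
        if msg["type"] == "res":
            del self._pending[msg["id"]]
            call["ok"] = msg["ok"]
            call["outcome"] = msg["result"] if msg["ok"] else msg["error"]
            return call
        return None
```

Because every verb shares the same three frame types, adding an op only means registering a new `op` string and handler; the correlation logic above stays untouched.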
-
Problem
dc-api today manages exactly one site: one Harvester cluster, one Rancher server, one KubeOVN fabric, configured by a single set of DCAPI_* env vars. A second datacenter is coming online, and the product goal is one control plane that deploys and manages resources across many datacenters, the way a public cloud exposes regions, without tenants ever caring where dc-api itself runs.
Two hard requirements shape the design:
region → zone, where a zone is one Harvester (+ its Rancher); naming and schema should assume this even while every region has exactly one zone.
Proposed model
Region/zone as first-class API objects. GET /v1/regions lists regions and their zones with health. Every regional resource (VNet, VM, cluster, bastion, volume) carries an immutable region (and eventually zone) attribute. Tenants and projects stay region-agnostic.
Containment does the validation. A VM's region derives from its VNet; a cluster's from its VNet; a bastion's from its VNet. Creating a VM in region B against a VNet in region A is a 422 with a clear message. There is no special-casing: the parent resource's region is simply authoritative. Only root resources (VNets, key vaults) take region as a free choice. VNet peering stays same-region until a cross-region fabric exists.
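The containment rule can be illustrated with a short Python sketch. The `VNet` and `VM` dataclasses, the `create_vm` function, and the `RegionMismatch` exception are hypothetical names invented for this example; only the behavior (region derived from the parent VNet, a mismatch rejected with 422) comes from the proposal.

```python
from dataclasses import dataclass
from typing import Optional


class RegionMismatch(Exception):
    """Hypothetical error the API layer would map to HTTP 422."""
    status = 422


@dataclass(frozen=True)
class VNet:
    name: str
    region: str  # root resource: region is a free choice at creation


@dataclass(frozen=True)
class VM:
    name: str
    vnet: VNet
    region: str  # always copied from the parent VNet, immutable afterwards


def create_vm(name: str, vnet: VNet, requested_region: Optional[str] = None) -> VM:
    """The parent VNet's region is authoritative; a conflicting request is rejected."""
    if requested_region is not None and requested_region != vnet.region:
        raise RegionMismatch(
            f"VM {name!r} requested region {requested_region!r}, "
            f"but VNet {vnet.name!r} is in region {vnet.region!r}"
        )
    return VM(name=name, vnet=vnet, region=vnet.region)
```

The same check would apply unchanged to clusters and bastions, since each of them also derives its region from a VNet.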
Provider registry. The Strategy/Factory layer already isolates handlers from backends. The singleton provider set becomes a registry keyed by region/zone: providers.For(region, zone) returns the same Compute/Cluster/Network interfaces. Handlers resolve the registry through the resource's region; the reconciler runs per-region loops so one region's outage can't stall another's reconciliation.
Transport: regional agents over outbound WSS (443)
Rather than routing the control plane into each datacenter's management network (site-to-site tunnels, O(n²) growth, inbound firewall holes), each region runs a small dc-agent that dials out to the control plane over WSS on 443 — internet-traversable, TLS end to end, nothing inbound to any datacenter. This is the same topology Rancher's cattle-agent, Azure Arc, and GitHub runners use, and it is the easier story to defend with security teams: each datacenter only ever opens an outbound HTTPS connection.
GET /v1/regions and the dashboard.
Region registration & credentials
Admin-driven and API-managed: POST /v1/admin/regions creates the region and mints a one-time agent bootstrap token; the agent is deployed in the datacenter with that token, dials in, completes a token exchange, and receives its long-lived identity (mTLS cert or rotating token). Decommissioning = revoke the agent identity. No region credentials are ever uploaded to the control plane.
Quotas & placement
Per-region tenant caps and project quotas (a region's capacity is physically its own), mirroring how public clouds scope quota by region. Project presence in a region materializes lazily — the per-project namespace/quota mirror is provisioned in a region the first time the project deploys there, not eagerly in every region at project creation.
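The lazy materialization described above amounts to an idempotent "ensure on first use" step in the deploy path. A minimal Python sketch follows; `RegionPresence`, its `ensure` method, and the `provision` callback are hypothetical names, and the in-memory set stands in for whatever persistent record dc-api would actually keep.

```python
from typing import Callable, List, Set, Tuple


class RegionPresence:
    """Tracks which (project, region) pairs already have a namespace/quota mirror."""

    def __init__(self, provision: Callable[[str, str], None]) -> None:
        self._provision = provision  # hypothetical hook that creates the mirror
        self._present: Set[Tuple[str, str]] = set()

    def ensure(self, project: str, region: str) -> bool:
        """Provision on first deploy into a region; return True only if work was done."""
        key = (project, region)
        if key in self._present:
            return False
        self._provision(project, region)
        self._present.add(key)  # recorded only after provisioning succeeded
        return True


if __name__ == "__main__":
    created: List[Tuple[str, str]] = []
    presence = RegionPresence(lambda p, r: created.append((p, r)))
    presence.ensure("proj-1", "region-a")  # first deploy: mirror is created
    presence.ensure("proj-1", "region-a")  # later deploys: no-op
    print(created)
```

Calling `ensure` at the start of every regional create keeps project creation cheap and means a region a project never uses never holds a namespace or quota record for it.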
Other consequences
Phasing
region/zone columns on regional tables (defaulted to the current site), read-only in API/UI. Cheap; protects against painful retrofits.
Open questions
GET /v1/regions health is agent-liveness vs deeper probes (Harvester API reachability, capacity headroom)?
Feedback welcome, especially on the agent protocol choice (q1) and phase-1-without-agents (q4), which gate the implementation order.