This document covers the fundamental concepts, controllers and objects that underpin cluster creation in Tanzu Kubernetes Grid (TKG) Service.
The user documentation on how to provision a TanzuKubernetesCluster is published separately.
A quick glossary before we start:
- TKG Service is a feature of vSphere with Kubernetes that stands up virtualized "Guest Clusters" using ClusterAPI
- Guest Cluster is an upstream virtualized Kubernetes Cluster that is created and managed declaratively using a TKC
- Supervisor Cluster refers to the Kubernetes API built into vSphere that uses ESXi hosts as nodes
- Project Pacific is the codename of the Project that was launched as vSphere 7.0 with Kubernetes
- WCP stands for Workload Control Plane and was the internal name for the project before Pacific
Please note that examples were correct at the time of publishing (April 2020)
The TKG Service capability is heavily based around the Kubernetes ClusterAPI project, which is designed to allow you to declaratively define a K8S cluster in the same way as you would a Pod or Deployment. Of course this presents something of a chicken and egg problem - to stand up your K8S cluster, you need a K8S cluster to bootstrap it from. In Project Pacific, the built-in K8S cluster called the Supervisor Cluster serves this purpose.
As such, if you want to troubleshoot TKG Service Lifecycle issues, you'll spend most of the time actually interacting with the Supervisor. We'll touch on aspects of the Guest Cluster itself when we look at node debugging, security etc.
At its most basic, cluster lifecycle is managed by 3 layers: The Tanzu Kubernetes Grid (TKG) layer, the ClusterAPI (CAPI) layer and the VM Operator layer.
ClusterAPI as the middle tier is the foundation around which everything else functions. The VM Operator layer supplies the plumbing and the TKG layer provides decoration and specialization. Spec is reconciled down the layers and Status is published back up the layers.
The user starts by creating a TanzuKubernetesCluster (TKC) definition, which is reconciled by the TKG controller. TKG takes that TKC and turns it into CAPI / CAPW objects (CAPW is the ClusterAPI provider for Project Pacific). It also reads CAPI / CAPW status and reflects that back into the TKC.
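For reference, the TKC definition itself is compact. The following is a minimal sketch of the kind of spec that would produce the six-node cluster used in the examples below - field layout per the run.tanzu.vmware.com/v1alpha1 schema of this era, with the values purely illustrative:

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: test-cluster
  namespace: ben-test
spec:
  distribution:
    version: v1.17.4            # resolved against the available VirtualMachineImages
  topology:
    controlPlane:
      count: 3                  # one Machine/WCPMachine/KubeadmConfig per node
      class: best-effort-small
      storageClass: gc-storage-profile
    workers:
      count: 3                  # realized via a MachineDeployment/MachineSet
      class: best-effort-small
      storageClass: gc-storage-profile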
The ClusterAPI layer is made up of 3 controllers and we'll look in more detail at what they do below. Fundamentally it's responsible for creating VirtualMachine definitions which contain the necessary configuration to stand up a cluster. Just like TKG, it also monitors the status of the VirtualMachine objects and reflects that back in the CAPI / CAPW objects.
The VM Operator layer is the declarative interface to vSphere. It creates objects such as VirtualMachineImage that show what images are available and reconciles VirtualMachine objects into actual VMs. It monitors the status of the real vSphere infrastructure and ensures that is reflected into the objects it manages.
As with everything, the reality is much more involved, although it follows the basic principle laid out above.
As you can see at the bottom of this diagram, the TanzuKubernetesCluster is reconciled by the TKG controller.
Note that each controller typically has 3 instances running with leader election enabled, so when we say "the controller" for any of these, we mean whichever of the 3 instances is currently acting as leader.
TKG has to generate all of the input for the ClusterAPI layer. This is currently logically grouped as 3 YAML files.
- Cluster.yaml contains the definition of the Cluster itself. Note that it's split into two objects - the CAPI Cluster and the WCPCluster that augments it.
- Controlplane.yaml contains the definition of the cluster control plane components. This consists of the following:
- A CAPI Machine. This is the baseline definition of a control plane node, containing spec and status common to all implementations
- A WCPMachine. This is the specialization of a control plane node from the vSphere standpoint. This is where vSphere value add and status for a node can be defined.
- A KubeadmConfig. We use kubeadm to stand up the cluster. KubeadmConfig provides the means to define in abstract terms what we want from kubeadm.
- Note that the Control Plane will have one of each of these objects for each control plane node
- MachineDeployment.yaml contains the definition of the worker components. This uses a distinct approach from the control plane (a sketch of how these objects relate follows this list).
- A MachineDeployment is a Kubernetes Deployment object that references a MachineSet.
- The MachineSet attempts to maintain a configured number of replica Machines and WCPMachines that it stamps out from templates
- WCPMachineTemplate is the template for each worker WCPMachine. This is what MachineSet uses as its template for creating WCPMachines
- KubeadmConfigTemplate is the template from which KubeadmConfig instances are created, one per node.
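As mentioned above, here is a sketch of how the worker objects hang together. This is a hand-written illustration of a v1alpha2 MachineDeployment rather than captured output; the names mirror the example cluster below:

apiVersion: cluster.x-k8s.io/v1alpha2
kind: MachineDeployment
metadata:
  name: test-cluster-workers-z7j8q
  namespace: ben-test
spec:
  replicas: 3                          # the MachineSet maintains this count
  template:
    spec:
      version: 1.17.4+vmware.1
      bootstrap:
        configRef:                     # stamped into a KubeadmConfig per node
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
          kind: KubeadmConfigTemplate
          name: test-cluster-workers-mg88z
      infrastructureRef:               # stamped into a WCPMachine per node
        apiVersion: infrastructure.cluster.vmware.com/v1alpha2
        kind: WCPMachineTemplate
        name: test-cluster-workers-hktqw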
Viewing these objects:
The raw YAML that's created only ever exists in memory in the controllers. It is never persisted to disk, so the only way to examine the objects it defines is via kubectl:
Use a kubectl client authenticated with the Supervisor cluster
This example shows a cluster with 3 control plane nodes and 3 worker nodes:
$ kubectl get clusters,wcpclusters,machines,wcpmachines,kubeadmconfigs,machinedeployments,machinesets,wcpmachinetemplates,kubeadmconfigtemplates -A
NAMESPACE   NAME                                    PHASE
ben-test    cluster.cluster.x-k8s.io/test-cluster   provisioned

NAMESPACE   NAME                                                         AGE
ben-test    wcpcluster.infrastructure.cluster.vmware.com/test-cluster    6h49m

NAMESPACE   NAME                                                                    PROVIDERID                                       PHASE
ben-test    machine.cluster.x-k8s.io/test-cluster-control-plane-jwsx7              vsphere://423697d9-706c-3e92-7514-088978955321   running
ben-test    machine.cluster.x-k8s.io/test-cluster-control-plane-zbmm5              vsphere://4236b40f-b40e-48d6-aa12-0d6b9fe0b5fe   running
ben-test    machine.cluster.x-k8s.io/test-cluster-control-plane-zdxfr              vsphere://4236b734-7e37-33d6-5e26-066d3553d419   running
ben-test    machine.cluster.x-k8s.io/test-cluster-workers-z7j8q-6957cdb9bb-7666j   vsphere://4236cfc4-29f8-78a3-7fc4-8e1c9ccc4e61   running
ben-test    machine.cluster.x-k8s.io/test-cluster-workers-z7j8q-6957cdb9bb-kd4mw   vsphere://4236494b-a76b-9e1d-d044-4dfe42a51d3a   running
ben-test    machine.cluster.x-k8s.io/test-cluster-workers-z7j8q-6957cdb9bb-vj2jm   vsphere://42369d95-baa6-eb91-c570-4818c0157e64   running

NAMESPACE   NAME                                                                             PROVIDERID                                       IPADDR
ben-test    wcpmachine.infrastructure.cluster.vmware.com/test-cluster-control-plane-jwsx7   vsphere://423697d9-706c-3e92-7514-088978955321   172.26.1.18
ben-test    wcpmachine.infrastructure.cluster.vmware.com/test-cluster-control-plane-zbmm5   vsphere://4236b40f-b40e-48d6-aa12-0d6b9fe0b5fe   172.26.1.21
ben-test    wcpmachine.infrastructure.cluster.vmware.com/test-cluster-control-plane-zdxfr   vsphere://4236b734-7e37-33d6-5e26-066d3553d419   172.26.1.19
ben-test    wcpmachine.infrastructure.cluster.vmware.com/test-cluster-workers-hktqw-h97cn   vsphere://4236cfc4-29f8-78a3-7fc4-8e1c9ccc4e61   172.26.1.20
ben-test    wcpmachine.infrastructure.cluster.vmware.com/test-cluster-workers-hktqw-tkwrw   vsphere://4236494b-a76b-9e1d-d044-4dfe42a51d3a   172.26.1.23
ben-test    wcpmachine.infrastructure.cluster.vmware.com/test-cluster-workers-hktqw-v6wss   vsphere://42369d95-baa6-eb91-c570-4818c0157e64   172.26.1.22

NAMESPACE   NAME                                                                         AGE
ben-test    kubeadmconfig.bootstrap.cluster.x-k8s.io/test-cluster-control-plane-jwsx7   6h49m
ben-test    kubeadmconfig.bootstrap.cluster.x-k8s.io/test-cluster-control-plane-zbmm5   6h49m
ben-test    kubeadmconfig.bootstrap.cluster.x-k8s.io/test-cluster-control-plane-zdxfr   6h49m
ben-test    kubeadmconfig.bootstrap.cluster.x-k8s.io/test-cluster-workers-mg88z-99jsn   6h49m
ben-test    kubeadmconfig.bootstrap.cluster.x-k8s.io/test-cluster-workers-mg88z-9mccr   6h49m
ben-test    kubeadmconfig.bootstrap.cluster.x-k8s.io/test-cluster-workers-mg88z-g7vph   6h49m

NAMESPACE   NAME                                                            AGE
ben-test    machinedeployment.cluster.x-k8s.io/test-cluster-workers-z7j8q   6h49m

NAMESPACE   NAME                                                                 AGE
ben-test    machineset.cluster.x-k8s.io/test-cluster-workers-z7j8q-6957cdb9bb   6h49m

NAMESPACE   NAME                                                                                AGE
ben-test    wcpmachinetemplate.infrastructure.cluster.vmware.com/test-cluster-workers-hktqw    6h49m

NAMESPACE   NAME                                                                           AGE
ben-test    kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/test-cluster-workers-mg88z   6h49m
The TanzuKubernetesCluster (TKC) is both the Specification for the cluster and shows detailed status of how the cluster lifecycle is progressing. We will look in more detail at TKC in another document.
In abstract terms, the objects generated above provide a blueprint for the CAPI / CAPW layer to stand up the Kubernetes Cluster. The interaction of the three controllers (CAPI, CAPW and CABPK) is a little involved, but the diagram above illustrates it well.
The CABPK controller is the simplest, so this is a good place to start. Its job is to reconcile each KubeadmConfig object and render it into some kind of node configuration format. In this particular instance, our configuration format is cloud-init, which is a universal means of configuring VMs on first boot. So the CABPK controller's job is to create a cloud-init config for each node.
It writes this config in base64 encoded format into the Status of each KubeadmConfig object:
$ kubectl describe kubeadmconfig.bootstrap.cluster.x-k8s.io/test-cluster-control-plane-jwsx7
Name:         test-cluster-control-plane-jwsx7
Namespace:    ben-test
Labels:       cluster.x-k8s.io/cluster-name=test-cluster
              cluster.x-k8s.io/control-plane=true
Annotations:  <none>
API Version:  bootstrap.cluster.x-k8s.io/v1alpha2
Kind:         KubeadmConfig
Metadata:
  ...
Spec:
  Cluster Configuration:
    API Server:
      ...
      Extra Volumes:
        ...
    Certificates Dir:        /etc/kubernetes/pki
    Cluster Name:            test-cluster
    Control Plane Endpoint:  {% if ds.meta_data.controlPlaneEndpoint %}{{ ds.meta_data.controlPlaneEndpoint }}{% else %}{{ ds.meta_data.local_ipv4 }}:6443{% endif %}
    Controller Manager:
      ...
    Dns:
      ...
    Etcd:
      ...
    Image Repository:    vmware.io
    Kubernetes Version:  1.17.4+vmware.1
    Networking:
      ...
    Scheduler:
      ...
  Files:
    ...
  Init Configuration:
    ...
  Post Kubeadm Commands:
    ...
  Pre Kubeadm Commands:
    ...
  Users:
    ...
Status:
  Bootstrap Data:  IyMgdGVtcGxhdGU6IGppbmphCiNjbG91ZC1jb25maWcKCndya
$ cat <bootstrap-data> | base64 --decode
## template: jinja
#cloud-config

write_files:
  - path: /etc/kubernetes/pki/ca.crt
    owner: root:root
    permissions: '0640'
    content: |
      -----BEGIN CERTIFICATE-----
      MIICyzCCAbOgAwIBAgIBADANBgkqhkiG9w0BAQsFADAVMRMwEQYDVQQDEwprdWJl
      ...
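A convenient way to extract and decode it in one step - this assumes the v1alpha2 field name bootstrapData, which is what the Bootstrap Data line above maps to:

$ kubectl -n ben-test get kubeadmconfig test-cluster-control-plane-jwsx7 \
    -o jsonpath='{.status.bootstrapData}' | base64 --decode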
A few things to note here:
- As you can see from the above, the config provided to KubeadmConfig is extensive, as is the cloud-init configuration.
- Note the presence of certificates in the cloud-init data. As the diagram shows, these are also written out by the CABPK controller
- Finally, note the use of jinja templating. This is done so that some parameters of the cloud-init script (such as IP address) can be late-bound on the node instance itself.
Once we look at node debugging, you'll see this exact same cloud-init config being processed by the various initialization stages on the node.
The CAPI controller is a little like the conductor of an orchestra. It reconciles everything except for the WCP-specific objects and it tries to ensure that each object has what it needs. It also monitors the health of the various components, ensuring that the results are reflected in the Machine and Cluster status.
Note that TKG Service currently uses version v1alpha2 of ClusterAPI
In order to better understand what CAPI controller does, let's have a look at an example of a Cluster and a Machine object:
$ kubectl describe cluster test-cluster
Name:         test-cluster
Namespace:    ben-test
Labels:       <none>
Annotations:  <none>
API Version:  cluster.x-k8s.io/v1alpha2
Kind:         Cluster
Metadata:
  Creation Timestamp:  2020-04-16T17:06:53Z
  Finalizers:
    cluster.cluster.x-k8s.io
  Generation:  1
  Owner References:
    API Version:           run.tanzu.vmware.com/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  TanzuKubernetesCluster
    Name:                  test-cluster
    UID:                   5e434bbe-7b24-4fbd-81c0-aecc72326002
  Resource Version:  436870
  Self Link:         /apis/cluster.x-k8s.io/v1alpha2/namespaces/ben-test/clusters/test-cluster
  UID:               e644bb12-2a08-4d27-be7f-88e460041d2d
Spec:
  Cluster Network:
    Pods:
      Cidr Blocks:
        192.0.2.0/16
    Service Domain:  tanzukubernetescluster.local
    Services:
      Cidr Blocks:
        198.51.100.0/12
  Infrastructure Ref:
    API Version:  infrastructure.cluster.vmware.com/v1alpha2
    Kind:         WCPCluster
    Name:         test-cluster
    Namespace:    ben-test
Status:
  API Endpoints:
    Host:  192.168.123.3
    Port:  6443
  Control Plane Initialized:  true
  Infrastructure Ready:       true
  Phase:                      provisioned
Events:  <none>
There are a few interesting things worth calling out here:
- Note the finalizer cluster.cluster.x-k8s.io. This will prevent the Cluster from being deleted until it's explicitly removed. It's how ClusterAPI ensures that things get deleted in a sane order.
- Note the owner reference to TanzuKubernetesCluster test-cluster. All the objects are in a dependency graph of owner references. These serve two important purposes:
  - They make it easy to define queries in controller-runtime like, "reconcile this object if any of its children change". That only works for the object's immediate descendants.
  - They also ensure that deletion cascades down through the owner reference hierarchy. If the TKC is deleted, all of its children are also deleted.
- You'll see this is where the Cluster Network settings are specified.
- The infrastructure reference to the WCPCluster is how the two are connected. WCPCluster also has an owner reference back to the Cluster.
- API Endpoints is typically just one value - the IP address for connecting to the API Server. It can in theory be multiple values, but only if there isn't a load balancer.
- The rest of the status simply indicates where the Cluster is in terms of its progress. That is all summarized in the TKC by the TKG controller.
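Because the CAPI-layer objects all carry the cluster.x-k8s.io/cluster-name label (you can see it on the Machine below), a quick way to pull everything belonging to a single cluster is a label selector:

$ kubectl -n ben-test get machines,wcpmachines,kubeadmconfigs -l cluster.x-k8s.io/cluster-name=test-cluster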
Now let's look at a Machine:
$ kubectl describe machine test-cluster-control-plane-jwsx7
Name:         test-cluster-control-plane-jwsx7
Namespace:    ben-test
Labels:       cluster.x-k8s.io/cluster-name=test-cluster
              cluster.x-k8s.io/control-plane=true
              run.tanzu.vmware.com/control-plane-init=true
Annotations:  <none>
API Version:  cluster.x-k8s.io/v1alpha2
Kind:         Machine
Metadata:
  Creation Timestamp:  2020-04-16T17:06:53Z
  Finalizers:
    machine.cluster.x-k8s.io
  Generation:  3
  Owner References:
    API Version:           run.tanzu.vmware.com/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  TanzuKubernetesCluster
    Name:                  test-cluster
    UID:                   5e434bbe-7b24-4fbd-81c0-aecc72326002
    API Version:           cluster.x-k8s.io/v1alpha2
    Kind:                  Cluster
    Name:                  test-cluster
    UID:                   e644bb12-2a08-4d27-be7f-88e460041d2d
  Resource Version:  436867
  Self Link:         /apis/cluster.x-k8s.io/v1alpha2/namespaces/ben-test/machines/test-cluster-control-plane-jwsx7
  UID:               c0666054-570b-467f-8a20-b719d267276b
Spec:
  Bootstrap:
    Config Ref:
      API Version:  bootstrap.cluster.x-k8s.io/v1alpha2
      Kind:         KubeadmConfig
      Name:         test-cluster-control-plane-jwsx7
      Namespace:    ben-test
    Data:  IyMgdGVtcGxhdGU6IGppbmphCiNjbG91ZC1jb25maWcKCndyaXRlX2ZpbGVzOgotICAgcGF0aDogL2V0Yy9rd...
           ...
  Infrastructure Ref:
    API Version:  infrastructure.cluster.vmware.com/v1alpha2
    Kind:         WCPMachine
    Name:         test-cluster-control-plane-jwsx7
    Namespace:    ben-test
  Metadata:
  Provider ID:  vsphere://423697d9-706c-3e92-7514-088978955321
  Version:      1.17.4+vmware.1
Status:
  Bootstrap Ready:       true
  Infrastructure Ready:  true
  Node Ref:
    Name:  test-cluster-control-plane-jwsx7
    UID:   9f03b370-bb53-4fdd-abcb-0a894a885be0
  Phase:  running
Events:  <none>
As you might expect, the Machine Spec defines the blueprint for a node and the Status is the current state of the node.
- Note the labels. They're all quite self-explanatory, but they're all meaningful for CAPI and should not be changed.
- The Machine has a finalizer, just like Cluster, to prevent it being deleted until the CAPI controller allows it.
- Note that this Machine has two owner references - one to the TKC and one to the Cluster object. The former is created by TKG and the latter is created by CAPI.
- The Bootstrap Data is a copy of the base64-encoded cloud-init kubeadm config from the KubeadmConfig Status.
- The Infrastructure Ref points to the corresponding WCPMachine for this Machine. The WCPMachine has an owner reference back to the Machine.
- The Provider ID is the biosUUID of the VM.
- The Version is the version of K8S that the node will run.
- The Status is fairly self-explanatory. Bootstrap Ready and Infrastructure Ready indicate whether the Bootstrap Data has been populated and whether the WCPMachine is available, respectively.
- The Node Ref is data gathered from the API server of the new cluster.
- Phase is running once the control plane node has successfully initialized. Valid values are pending, running, terminating or failed.
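If you're watching a cluster come up, the Phase column is the quickest signal of progress; -w streams updates as CAPI reconciles each Machine:

$ kubectl -n ben-test get machines -w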
While the CAPI and CABPK controllers are both upstream shared code, the CAPW controller is unique to TKG Service. It's a relatively small controller that delegates all of the heavy lifting to VM Operator.
The primary purpose of CAPW is to reconcile Cluster, WCPCluster, Machine and WCPMachine objects and maintain VirtualMachine objects as a way of driving VM Operator.
An important subcomponent of CAPW is the VirtualNetwork, which is a handshake between CAPW and the network provider, in this case NSX. See:
$ kubectl describe VirtualNetworks -A
Name:         test-cluster-vnet
Namespace:    ben-test
Labels:       <none>
Annotations:  ncp/extpoolid: domain-c9:b7acdc46-de1f-4613-b3ea-2dd2e52855cc-ippool-192-168-124-1-192-168-124-254
              ncp/snat_ip: 192.168.124.16
              ncp/subnet-0: 172.26.1.16/28
API Version:  vmware.com/v1alpha1
Kind:         VirtualNetwork
Metadata:
  Creation Timestamp:  2020-04-16T17:06:53Z
  Generation:          2
  Owner References:
    API Version:  infrastructure.cluster.vmware.com/v1alpha2
    Kind:         WCPCluster
    Name:         test-cluster
    UID:          9642d4c8-a024-4f6e-ac08-492c9848b19b
  Resource Version:  433547
  Self Link:         /apis/vmware.com/v1alpha1/namespaces/ben-test/virtualnetworks/test-cluster-vnet
  UID:               0c909654-a822-4831-b6dd-3cb10e25631a
Spec:
Status:
  Conditions:
    Status:  True
    Type:    Ready
  Default SNATIP:  192.168.124.16
Events:
  Type    Reason                        Age  From                                                 Message
  ----    ------                        ---- ----                                                 -------
  Normal  SuccessfulRealizeNSXResource  15m  nsx-container-ncp, 42367cfa2d665d57f3fd28b1131a3374  Successfully realized NSX resource for VirtualNetwork
The VirtualNetwork object is created by CAPW and populated by the NSX control plane. CAPW then copies a reference to the VirtualNetwork into the VirtualMachine when it's created.
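You can see that handoff reflected in each VirtualMachine spec. The jsonpath below follows the Network Interfaces structure visible in the VirtualMachine output later in this document (field paths are inferred from that output):

$ kubectl -n ben-test get virtualmachines \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.networkInterfaces[0].networkName}{"\n"}{end}'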
Let's have a look at WCPCluster and WCPMachine
$ kubectl describe WCPCluster test-cluster
Name:         test-cluster
Namespace:    ben-test
Labels:       <none>
Annotations:  <none>
API Version:  infrastructure.cluster.vmware.com/v1alpha2
Kind:         WCPCluster
Metadata:
  Creation Timestamp:  2020-04-16T17:06:53Z
  Finalizers:
    wcpcluster.infrastructure.cluster.vmware.com
  Generation:  1
  Owner References:
    API Version:           run.tanzu.vmware.com/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  TanzuKubernetesCluster
    Name:                  test-cluster
    UID:                   5e434bbe-7b24-4fbd-81c0-aecc72326002
    API Version:           cluster.x-k8s.io/v1alpha2
    Kind:                  Cluster
    Name:                  test-cluster
    UID:                   e644bb12-2a08-4d27-be7f-88e460041d2d
  Resource Version:  433592
  Self Link:         /apis/infrastructure.cluster.vmware.com/v1alpha2/namespaces/ben-test/wcpclusters/test-cluster
  UID:               9642d4c8-a024-4f6e-ac08-492c9848b19b
Spec:
Status:
  API Endpoints:
    Host:  192.168.123.3
    Port:  6443
  Ready:                 true
  Resource Policy Name:  test-cluster
Events:  <none>
There's not much of interest here. In fact, the Spec is completely empty - WCPCluster adds nothing to augment the Cluster definition. The one reference of interest is the Resource Policy Name, which points to a VirtualMachineSetResourcePolicy.
Let's look at one:
$ kubectl describe VirtualMachineSetResourcePolicy -A
Name:         test-cluster
Namespace:    ben-test
Labels:       <none>
Annotations:  <none>
API Version:  vmoperator.vmware.com/v1alpha1
Kind:         VirtualMachineSetResourcePolicy
Metadata:
  Creation Timestamp:  2020-04-16T17:06:53Z
  Finalizers:
    virtualmachinesetresourcepolicy.vmoperator.vmware.com
  Generation:  1
  Owner References:
    API Version:  infrastructure.cluster.vmware.com/v1alpha2
    Kind:         WCPCluster
    Name:         test-cluster
    UID:          9642d4c8-a024-4f6e-ac08-492c9848b19b
  Resource Version:  433473
  Self Link:         /apis/vmoperator.vmware.com/v1alpha1/namespaces/ben-test/virtualmachinesetresourcepolicies/test-cluster
  UID:               b62f499f-c578-4aae-8cc3-ee21a150c624
Spec:
  Clustermodules:
    Groupname:  control-plane-group
    Groupname:  test-cluster-workers-0
  Folder:
    Name:  test-cluster
  Resourcepool:
    Limits:
      Cpu:     0
      Memory:  0
    Name:      test-cluster
    Reservations:
      Cpu:     0
      Memory:  0
Status:
  Clustermodules:
    Groupname:    control-plane-group
    Module UUID:  52f2e164-5c78-b73e-7045-11a515f617d6
    Groupname:    test-cluster-workers-0
    Module UUID:  5203b2e6-3cf5-4c86-fb0b-7324b25dac69
Events:  <none>
In the same way that CAPW creates VirtualMachines, it also creates this VirtualMachineSetResourcePolicy object, which then gets populated by VM Operator. Clustermodules maps directly to vSphere Cluster Modules.
The WCPMachine is as you might expect:
$ kubectl describe WCPMachine test-cluster-control-plane-jwsx7
Name:         test-cluster-control-plane-jwsx7
Namespace:    ben-test
Labels:       cluster.x-k8s.io/cluster-name=test-cluster
              cluster.x-k8s.io/control-plane=true
Annotations:  <none>
API Version:  infrastructure.cluster.vmware.com/v1alpha2
Kind:         WCPMachine
Metadata:
  Creation Timestamp:  2020-04-16T17:06:53Z
  Finalizers:
    wcpmachine.infrastructure.cluster.vmware.com
  Generate Name:  test-cluster-control-plane-
  Generation:     2
  Owner References:
    API Version:           run.tanzu.vmware.com/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  TanzuKubernetesCluster
    Name:                  test-cluster
    UID:                   5e434bbe-7b24-4fbd-81c0-aecc72326002
    API Version:           cluster.x-k8s.io/v1alpha2
    Kind:                  Machine
    Name:                  test-cluster-control-plane-jwsx7
    UID:                   c0666054-570b-467f-8a20-b719d267276b
  Resource Version:  435441
  Self Link:         /apis/infrastructure.cluster.vmware.com/v1alpha2/namespaces/ben-test/wcpmachines/test-cluster-control-plane-jwsx7
  UID:               7902645a-5ee6-4488-9691-7502bb90238a
Spec:
  Class Name:     best-effort-small
  Image Name:     photon-3-k8s-v1.17.4---vmware.1-tkg.1.0dba899
  Provider ID:    vsphere://423697d9-706c-3e92-7514-088978955321
  Storage Class:  gc-storage-profile
Status:
  Ready:     true
  Vm ID:     423697d9-706c-3e92-7514-088978955321
  Vm Ip:     172.26.1.18
  Vmstatus:  ready
Events:  <none>
The only thing worth noting here is the Provider ID, which as you can see is the VM's biosUUID. Supplying a unique infrastructure ID is part of the integration contract with CAPI, which is why it's part of the Spec and not the Status.
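Because the same ID threads through all three layers, it makes a handy correlation key when debugging. The field paths below are inferred from the describe output in this document:

$ kubectl -n ben-test get machine test-cluster-control-plane-jwsx7 -o jsonpath='{.spec.providerID}'
vsphere://423697d9-706c-3e92-7514-088978955321
$ kubectl -n ben-test get wcpmachine test-cluster-control-plane-jwsx7 -o jsonpath='{.spec.providerID}'
vsphere://423697d9-706c-3e92-7514-088978955321
$ kubectl -n ben-test get virtualmachine test-cluster-control-plane-jwsx7 -o jsonpath='{.status.biosUUID}'
423697d9-706c-3e92-7514-088978955321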
As you'll see from the diagram, worker machines are generated by a MachineSet rather than being explicitly defined when the cluster YAML is applied. This is so that the workers can be scaled up (and in theory, down - although we don't currently support that).
The owner ref hierarchy of worker machines is a little different.
- The Machine has an owner ref to its MachineSet. In contrast, a control plane Machine has a reference to the Cluster and the TKC
- The WCPMachine has an owner ref to both the MachineSet and the Machine it corresponds to. In contrast, a control plane WCPMachine refers to its associated Machine and the TKC
A final difference worth noting is that the name of a worker WCPMachine does not match the name of its Machine, because they receive generated names independently. This is in contrast to control plane Machines and WCPMachines, which share the same name.
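Given the differing names, the Provider ID is the easiest way to pair a worker Machine with its WCPMachine. The custom-columns paths assume the v1alpha2 schema shown throughout this document:

$ kubectl -n ben-test get machines,wcpmachines \
    -o custom-columns=KIND:.kind,NAME:.metadata.name,PROVIDERID:.spec.providerID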
VM Operator provides a declarative means of manipulating vSphere infrastructure - primarily VirtualMachines. The VM Operator API is now on GitHub at http://github.com/vmware-tanzu/vm-operator-api
The main Kubernetes object of interest in the VM Operator layer is, of course, VirtualMachine.
Let's look at one:
$ kubectl describe virtualmachine test-cluster-control-plane-jwsx7
Name:         test-cluster-control-plane-jwsx7
Namespace:    ben-test
Labels:       capw.vmware.com/cluster.name=test-cluster
              capw.vmware.com/cluster.role=controlplane
Annotations:  vsphere-cluster-module-group: control-plane-group
              vsphere-tag: CtrlVmVmAATag
API Version:  vmoperator.vmware.com/v1alpha1
Kind:         VirtualMachine
Metadata:
  Creation Timestamp:  2020-04-16T17:07:10Z
  Finalizers:
    virtualmachine.vmoperator.vmware.com
  Generation:  1
  Owner References:
    API Version:           infrastructure.cluster.vmware.com/v1alpha2
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  WCPMachine
    Name:                  test-cluster-control-plane-jwsx7
    UID:                   7902645a-5ee6-4488-9691-7502bb90238a
  Resource Version:  442007
  Self Link:         /apis/vmoperator.vmware.com/v1alpha1/namespaces/ben-test/virtualmachines/test-cluster-control-plane-jwsx7
  UID:               dff16c1e-8306-4cad-b911-2f9aa2ffcce5
Spec:
  Class Name:  best-effort-small
  Image Name:  photon-3-k8s-v1.17.4---vmware.1-tkg.1.0dba899
  Network Interfaces:
    Network Name:  test-cluster-vnet
    Network Type:  nsx-t
  Power State:  poweredOn
  Readiness Probe:
    Tcp Socket:
      Port:  6443
  Resource Policy Name:  test-cluster
  Storage Class:         gc-storage-profile
  Vm Metadata:
    Config Map Name:  test-cluster-control-plane-jwsx7-cloud-init
    Transport:        ExtraConfig
Status:
  Bios UUID:    423697d9-706c-3e92-7514-088978955321
  Host:         10.185.22.162
  Phase:        Created
  Power State:  poweredOn
  Unique ID:    vm-1077
  Vm Ip:        172.26.1.18
Events:  <none>
The most interesting thing about this VirtualMachine is what's not there as opposed to what is there. Where is all of the cloud-init config that was in the Machine's bootstrap data? As you may have spotted, it's actually in a ConfigMap stored separately from the VirtualMachine. This makes sense because it's a set of key-value pairs.
Note that the transport specified is ExtraConfig, so VM Operator uses GuestInfo keys in the VM's ExtraConfig to deliver the cloud-init data. Here are the contents of that ConfigMap; its key-value pairs are applied directly to the VM's ExtraConfig:
$ kubectl describe configmap test-cluster-control-plane-jwsx7-cloud-init
Name: test-cluster-control-plane-jwsx7-cloud-init
Namespace: ben-test
Labels: <none>
Annotations: <none>
Data
====
guestinfo.metadata:
----
Cmluc3RhbmNlLWlkOiAidGVzdC1jbHVzdGVyLWNvbnRyb2wtcGxhbmUtandzeDciCmxvY2FsLWhvc3RuYW1lOiAidGVzdC1jbHVzdGVyLWNvbnRyb2wtcGxhbmUtandzeDciCgpjb250cm9sUGxhbmVFbmRwb2ludDogIjE5Mi4xNjguMTIzLjM6NjQ0MyIKCg==
guestinfo.metadata.encoding:
----
base64
guestinfo.userdata:
----
IyMgdGVtcGxhdGU6IGppbmphCiNjbG91ZC1jb25maWcKCndyaXRlX2ZpbGVzOgotICAgcGF0aDogL2V0...
...
guestinfo.userdata.encoding:
----
base64
Events: <none>
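Decoding the guestinfo.metadata key shows the instance identity plus the late-bound controlPlaneEndpoint that the jinja template referenced earlier. The backslashes simply escape the dots in the ConfigMap key for jsonpath:

$ kubectl -n ben-test get configmap test-cluster-control-plane-jwsx7-cloud-init \
    -o jsonpath='{.data.guestinfo\.metadata}' | base64 --decode
instance-id: "test-cluster-control-plane-jwsx7"
local-hostname: "test-cluster-control-plane-jwsx7"

controlPlaneEndpoint: "192.168.123.3:6443"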
As you may be aware, VirtualMachineImage is the mechanism VM Operator uses to advertise which images are available for creating VirtualMachines. Note that VM Operator is currently opinionated towards a clone mechanism in which a VirtualMachine is always cloned from an image; there is currently no way of specifying ISOs or OVAs directly.
From a Guest Cluster standpoint, a VirtualMachineImage is therefore a reflection of the available K8S versions that can be specified:
$ kubectl get VirtualMachineImages
NAME                                            VERSION                          OSTYPE
photon-3-k8s-v1.17.4---vmware.1-tkg.1.0dba899   v1.17.4+vmware.1-tkg.1.0dba899   vmwarePhoton64Guest
You don't need to specify the full version when you create a TKC. You can just use the short version.
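In other words, a distribution hint like the one below should resolve to the image above (a sketch of the matching behavior; the resolution shown in the comment is the assumption here):

spec:
  distribution:
    version: v1.17.4    # short form; assumed to resolve to v1.17.4+vmware.1-tkg.1.0dba899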
VirtualMachineClass is a way of encapsulating a t-shirt size and associated settings for a VM. Guest Clusters come with a variety of these VirtualMachineClasses out of the box:
$ kubectl get VirtualMachineClasses
NAME AGE
best-effort-large 31h
best-effort-medium 31h
best-effort-small 31h
best-effort-xlarge 31h
best-effort-xsmall 31h
guaranteed-large 31h
guaranteed-medium 31h
guaranteed-small 31h
guaranteed-xlarge 31h
guaranteed-xsmall 31h
"Best Effort" in this context means that the memory and CPU are not reserved. A "Guaranteed" class means that the CPU and Memory are fully reserved
When a persistent volume is created for a Pod in a Guest Cluster, it must be attached and mounted to the node the pod is running on. This is typically achieved via a Cloud Provider, but in Guest Clusters, it is achieved paravirtually.
A request for a PersistentVolumeClaim from a StorageClass that a Guest Cluster has access to causes a VMDK-backed PersistentVolume and PersistentVolumeClaim to be created in the Supervisor Cluster. When a pod that uses the PVC is deployed to the Guest Cluster, the volume needs to be associated with the VirtualMachine that represents the node. This is actually done by the CSI control plane (via CNS integration) reconfiguring the Spec of the VirtualMachine; the attach/detach is then handled by VM Operator.
Note the addition of the Volumes clause in the Spec of the worker VirtualMachine running the pod:
...
Spec:
  Class Name:  best-effort-small
  Image Name:  photon-3-k8s-v1.17.4---vmware.1-tkg.1.0dba899
  Network Interfaces:
    Network Name:  test-cluster-vnet
    Network Type:  nsx-t
  Power State:           poweredOn
  Resource Policy Name:  test-cluster
  Storage Class:         gc-storage-profile
  Vm Metadata:
    Config Map Name:  test-cluster-workers-hktqw-v6wss-cloud-init
    Transport:        ExtraConfig
  Volumes:
    Name:  5e434bbe-7b24-4fbd-81c0-aecc72326002-427281a3-780b-41dd-80d4-165c7624c31d
    Persistent Volume Claim:
      Claim Name:  5e434bbe-7b24-4fbd-81c0-aecc72326002-427281a3-780b-41dd-80d4-165c7624c31d
...
This isn't necessarily explicitly related to lifecycle, but it's definitely interesting to note that VM configuration is all handled through the VirtualMachine object.