This repository has been archived by the owner on Jun 28, 2023. It is now read-only.

Verify TMC Integration works for registering TCE clusters #2992

Closed
joshrosso opened this issue Feb 1, 2022 · 18 comments
Labels
  • kind/feature: A request for a new feature
  • owner/core-eng: Work executed by TCE's core engineering team
  • owner/framework: Work executed in vmware-tanzu/tanzu-framework
  • reporting/health/on-track
Comments

@joshrosso
Contributor

joshrosso commented Feb 1, 2022

Summary

As part of our v0.12.0 release, we are required to ensure TMC can attach and register to TCE clusters. Attach has historically worked, but register had specific requirements that have been resolved in newer versions of TKG/Tanzu-Framework.

Proposal

In order to achieve the above, there are two primary needs:

  1. TMC can identify a TCE-bootstrapped cluster
  2. TMC can use the same tanzu CLI to interact with TCE management clusters

Identify a TCE Management Cluster

When TMC registers to a management-cluster, it needs a way to detect that the management-cluster was produced by TCE. This enables knowledge of things like:

  • What package repos might be available
  • What TKr BOM to use when creating workload clusters

The above are not guarantees of things TMC will do, but more so demonstrate the need.

In order to accomplish this, we'll ensure TCE-created clusters receive an edition: tce annotation. The TKGVERSION annotation will continue to work and will indicate the version of our Bill of Materials. An example structure is:

apiVersion: v1
items:
- apiVersion: cluster.x-k8s.io/v1beta1
  kind: Cluster
  metadata:
    annotations:
      TKGOperationInfo: '{"Operation":"Create","OperationStartTimestamp":"2022-02-01
        22:26:33.254455384 +0000 UTC","OperationTimeout":1800}'
      TKGOperationLastObservedTimestamp: 2022-02-01 22:26:48.297163967 +0000 UTC
      TKGVERSION: v1.6.0-zshippable
      edition: tce

By default, this may also mean edition: tkg is detectable. However, the TCE project does not make any guarantees as to what TKG propagates for this value.
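
As a rough illustration of how a consumer might read this annotation, here is a minimal sketch. It assumes the Cluster object has already been fetched and decoded into a plain dictionary; the function name `detect_edition` and the `"unknown"` fallback are hypothetical and not part of TMC or tanzu-framework.

```python
from typing import Any, Mapping


def detect_edition(cluster: Mapping[str, Any], default: str = "unknown") -> str:
    """Return the `edition` annotation of a decoded Cluster object.

    Hypothetical helper: falls back to `default` when the annotation
    (or the whole metadata/annotations section) is missing.
    """
    metadata = cluster.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    return annotations.get("edition", default)


# Trimmed-down version of the example structure above.
cluster = {
    "apiVersion": "cluster.x-k8s.io/v1beta1",
    "kind": "Cluster",
    "metadata": {
        "annotations": {
            "TKGVERSION": "v1.6.0-zshippable",
            "edition": "tce",
        }
    },
}

print(detect_edition(cluster))
```

Returning a fallback instead of raising keeps the caller free to decide how to treat clusters that carry no edition annotation at all.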

Use the same tanzu CLI to interact with TCE management clusters

Historically, we've compiled binaries from tanzu framework to support TCE-specific needs. However, TMC needs to be able to use the same CLI in order to interact with TCE management clusters. We will support this model by setting configuration:

tanzu config set cli.edition tce

This corresponds to the ClientConfig found in ~/.config/tanzu/config.yaml:

apiVersion: config.tanzu.vmware.com/v1alpha1
clientOptions:
  cli:
    compatabilityFileLocation: projects.registry.vmware.com/tce/tkg-compatibility
    discoverySources:
    - local:
        name: default-local
        path: standalone
    - local:
        name: admin-local
        path: admin
    edition: tce
    repositories:
    - gcpPluginRepository:
        bucketName: tanzu-cli-framework
        name: core
    unstableVersionSelector: none
  features:
    cluster:
      custom-nameservers: "false"
      dual-stack-ipv4-primary: "false"
      dual-stack-ipv6-primary: "false"
    global:
      context-aware-cli-for-plugins: "true"
    management-cluster:
      custom-nameservers: "false"
      dual-stack-ipv4-primary: "false"
      dual-stack-ipv6-primary: "false"
      export-from-confirm: "true"
      import: "false"
      network-separation-beta: "false"
      standalone-cluster-mode: "false"
kind: ClientConfig
metadata:
  creationTimestamp: null

A notable impact of running the command above is that it will:

  • Clear out the existing ~/.config/tanzu/tkg/compatibility file.
  • Set the compatibility file path to projects.registry.vmware.com/tce/tkg-compatibility

The above will ensure all management-cluster interactions going forward work off the TCE-specific BOM files.
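
To make the two side effects concrete, here is a small model of them. This is only a sketch under stated assumptions, not the CLI's actual implementation: the function `set_edition_tce` is hypothetical, the client config is modeled as a plain dictionary, and the cached compatibility data is modeled as a single file, following the wording above.

```python
import tempfile
from pathlib import Path

# Registry path taken from the proposal above.
TCE_COMPATIBILITY_PATH = "projects.registry.vmware.com/tce/tkg-compatibility"


def set_edition_tce(client_config: dict, compatibility_file: Path) -> dict:
    """Hypothetical model of the side effects of switching the edition to tce."""
    # 1. Clear out the cached compatibility file, if one exists.
    compatibility_file.unlink(missing_ok=True)

    # 2. Record the edition and point the compatibility location at the TCE path.
    #    The key name mirrors the config.yaml shown above.
    cli = client_config.setdefault("clientOptions", {}).setdefault("cli", {})
    cli["edition"] = "tce"
    cli["compatabilityFileLocation"] = TCE_COMPATIBILITY_PATH
    return client_config


# Demo against a throwaway directory instead of the real ~/.config/tanzu.
with tempfile.TemporaryDirectory() as tmp:
    cached = Path(tmp) / "compatibility"
    cached.write_text("stale compatibility data")
    config = set_edition_tce({"clientOptions": {"cli": {"edition": "tkg"}}}, cached)
    still_cached = cached.exists()
```

After the call, the stale file is gone and the in-memory config carries the TCE edition and compatibility path.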

Issue Tracking:

TMC must be able to detect a management-cluster came from TCE

TMC must be able to use the tanzu-framework library to create TCE-specific workload clusters

To support the above, TCE must build with the newest version of Framework

@joshrosso added the triage/needs-triage and kind/feature labels Feb 1, 2022
@joshrosso added this to the v0.11.0 milestone Feb 1, 2022
@stmcginnis added the owner/framework and owner/core-eng labels and removed the triage/needs-triage label Feb 1, 2022
@joshrosso self-assigned this Feb 1, 2022
@joshrosso
Contributor Author

@berndtj are you and/or someone from your team able to look at this proposal and let us know if it makes sense based on your understanding of needs for TCE+TMC interop?

@mfine30

mfine30 commented Feb 3, 2022

Overall this looks like what I would expect, at least from a product perspective. I have a few questions just to confirm things though.

I see an issue titled "TMC must be able to use the tanzu-framework library to create TCE-specific workload clusters", but I don't understand enough of the actual issue under it, #2716, to know if that will solve what the title describes. I assume it will, since you linked them?

Will TMC also be able to understand that the workload clusters are TCE workload clusters or is it just the management clusters? I'd vote for both if possible.

@joshrosso
Contributor Author

@mfine30 thanks for the follow-up.

I see an issue titled "TMC must be able to use the tanzu-framework library to create TCE-specific workload clusters", but I don't understand enough of the actual issue under it, #2716, to know if that will solve what the title describes. I assume it will, since you linked them?

Yes, #2716 aims to solve a case where the tanzu management-cluster and cluster CLI plugins (from tanzu-framework) can operate against TCE and TKG clusters with the same binary.

I'd need guidance from TMC if more than this is needed.

Will TMC also be able to understand that the workload clusters are TCE workload clusters or is it just the management clusters? I'd vote for both if possible.

Based on the current plumbing, TMC will know that the workload clusters came from TCE by nature of their being registered to the management cluster that made them. However, it will not be obvious in the same way by looking only at the workload cluster.

Can y'all expand on whether this is a requirement or more of a nice-to-have?

@berndtj

berndtj commented Feb 3, 2022

I think what @mfine30 is getting at is that, ideally, a workload cluster that is attached only (i.e. without a registered management cluster) should also be detectable by TMC as coming from TCE (or TKG, for that matter).

cc: @steven-zou

@mfine30

mfine30 commented Feb 3, 2022

Here are the user stories that I have in mind driving this:

  1. As a TMC/TCE user, I can register a TCE management cluster to TMC
  2. As a TMC/TCE user with a registered TCE management cluster, I can use that management cluster to provision a TCE workload cluster
  3. As a TMC/TCE user, I can identify that a management cluster registered to TMC is TCE vs TKG
  4. As a TMC/TCE user, I can identify that a workload cluster provisioned by TMC/TCE is TCE vs TKG
  5. As a TMC/TCE user, with a TCE workload cluster that I created outside of TMC (and the management cluster is not registered to TMC), when I attach the workload cluster to TMC I can tell that it is a TCE workload cluster

@joshrosso
Contributor Author

joshrosso commented Feb 3, 2022

@berndtj @mfine30, makes sense.

Can you please verify 4 and 5 are required out of the gate?

Or, can 1-3 suffice for the initial release?

@mfine30

mfine30 commented Feb 3, 2022

Yes I'd consider 4 and 5 as required as well. I worry that only doing 1-3 creates a sub-par UX that looks to customers like we forgot to finish it.

@joshrosso
Contributor Author

joshrosso commented Feb 3, 2022

k, i'll spec it out in the proposal above.

@mfine30 and @berndtj how do you do this detection for 4 and 5 with TKG workload clusters today?

In other words, how do you look at just an arbitrary cluster and determine it's a TKG cluster?

@mfine30

mfine30 commented Feb 4, 2022

@joshrosso – thanks for pushing on that and making us think through it. From chatting with @berndtj, I realized that I thought we did more in TMC than TMC does today. TMC currently only shows the cloud vs the provider for an attached cluster (i.e. it's a vSphere cluster for an attached cluster).

I think that means for this track of work:

  • item 4 can be inferred by TMC based on the management cluster rather than introspecting the workload cluster; @berndtj you ok with that approach?
  • item 5 should not be considered a requirement
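
The inference in the first bullet could be sketched as follows. This is a minimal illustration only: it assumes TMC holds a decoded copy of the management cluster object, and the names `edition_annotation` and `infer_workload_edition` are hypothetical.

```python
from typing import Any, Mapping, Optional


def edition_annotation(cluster: Mapping[str, Any]) -> Optional[str]:
    """Read the `edition` annotation from a decoded Cluster object, if present."""
    annotations = (cluster.get("metadata") or {}).get("annotations") or {}
    return annotations.get("edition")


def infer_workload_edition(management_cluster: Optional[Mapping[str, Any]]) -> str:
    """Infer a workload cluster's edition from the management cluster that made it.

    The workload cluster itself is never inspected. An attached-only cluster
    (no registered management cluster, story 5) cannot be classified this way.
    """
    if management_cluster is None:
        return "unknown"
    return edition_annotation(management_cluster) or "unknown"


tce_management = {"metadata": {"annotations": {"edition": "tce"}}}
provisioned = infer_workload_edition(tce_management)  # story 4
attached_only = infer_workload_edition(None)          # story 5
```

The `None` branch mirrors why item 5 falls outside this approach: with no registered management cluster there is nothing to infer from.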

@joshrosso
Contributor Author

joshrosso commented Feb 4, 2022

perfect, that's exactly what I was hoping y'all would say.

Makes sense, and I do agree it'd be a solid thing to expose in the future.

@berndtj can you throw us a 👍 if you're on the same page with what @mfine30 said?

@steven-zou

From the TMC LCM engineering perspective, we'd like the current TKG LCM extension to also support managing the LCM of TCE clusters, to reduce effort and code complexity. To achieve this goal, some needs have to be met.

The first and most important one is primary need 2 of this proposal:

TMC can use the same tanzu CLI to interact with TCE management clusters

It should be noted here that the current TKG LCM extension references the client library in the mirrored tanzu-framework GitLab repository core-build/mirrors_github_tanzu-framework instead of the GitHub repository github.com/vmware-tanzu/tanzu-framework. I'm not sure whether there are incompatibilities between these two repos when supporting TCE. We have to think about it ahead of time.

Next is some metadata for detecting and identifying TKG/TCE clusters, which might include:

  • metadata.annotations[TKGVERSION]: for getting the TKG/TCE version
  • metadata.labels[tanzuKubernetesRelease]: for upgrading
  • metadata.annotations[edition]: for distinguishing TKG from TCE (so far, edition-related info is hardcoded to TKG on the TMC side)
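
To illustrate how these three fields might be collected from a cluster object, here is a minimal sketch. It assumes the object is already decoded into a dictionary; the function name `cluster_identity`, the returned key names, and the sample TKr label value are all hypothetical.

```python
from typing import Any, Dict, Mapping, Optional


def cluster_identity(cluster: Mapping[str, Any]) -> Dict[str, Optional[str]]:
    """Collect the identifying metadata listed above; missing fields become None."""
    metadata = cluster.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    labels = metadata.get("labels") or {}
    return {
        "version": annotations.get("TKGVERSION"),     # TKG/TCE version
        "tkr": labels.get("tanzuKubernetesRelease"),  # used for upgrades
        "edition": annotations.get("edition"),        # TKG vs TCE
    }


example = {
    "metadata": {
        "annotations": {"TKGVERSION": "v1.6.0-zshippable", "edition": "tce"},
        # Made-up label value, for illustration only.
        "labels": {"tanzuKubernetesRelease": "example-tkr"},
    }
}
identity = cluster_identity(example)
```

Leaving missing fields as `None` lets the caller distinguish "not set" from an explicit value, which matters while the edition is still hardcoded on the TMC side.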

cc @berndtj and @renmaosheng

@steven-zou

steven-zou commented Feb 7, 2022

Additionally, story item 5 is really not part of TKG LCM stories.

@joshrosso
Contributor Author

It should be noted here that the current TKG LCM extension references the client library in the mirrored tanzu-framework GitLab repository core-build/mirrors_github_tanzu-framework instead of the GitHub repository github.com/vmware-tanzu/tanzu-framework. I'm not sure whether there are incompatibilities between these two repos when supporting TCE. We have to think about it ahead of time.

Who can figure this compatibility question out?

As part of this work, we'll ensure the upstream libs (tanzu-framework) have this capability. Who will own the reconciliation of the mirror/downstream dep?

@steven-zou

steven-zou commented Feb 7, 2022

Who can figure this compatibility question out?

As part of this work, we'll ensure the upstream libs (tanzu-framework) have this capability. Who will own the reconciliation of the mirror/downstream dep?

There might be some commit differences between these two repos. I think it would be hard to ensure compatibility through manual effort. Maybe via automation pipelines?

@renmaosheng

@Maharshi Bhatt is the main contact on the TKG downstream side; we can confirm the compatibility questions with Maharshi. To echo Steven Zou's point, we want the TKG and TCE releases to share the same tanzu-framework SHA, so that TMC doesn't need to treat TKG and TCE separately. We are also working with TKG to define an API instead of using the client library code, and in the long term we want to align with TCE on using the same API.

@joshrosso
Contributor Author

@renmaosheng

we want the TKG and TCE releases to share the same tanzu-framework SHA, so that TMC doesn't need to treat TKG and TCE separately.

Unfortunately, this will not be possible. Tanzu-framework should provide backwards compatibility guarantees. We cannot pin the community project on versions of our downstream product.

@joshrosso
Contributor Author

Moving this out to re-evaluate in v0.12.0.

We've decided to take a slightly different approach to ensure we can ship v0.11.0.

That work is tracked here: #3285

@joshrosso removed their assignment Mar 4, 2022
@joshrosso removed this from the v0.11.0 milestone Mar 4, 2022
@joshrosso added this to the v0.12.0 milestone Mar 4, 2022
@joshrosso modified the milestones: v0.12.0, v0.11.0 Mar 21, 2022
@joshrosso
Contributor Author

The original intent of registration compatibility was solved for the v0.11.0 release. Enhancements to further support TMC should be opened in separate issues, for example:

6 participants