Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: define the CRD of nested components #11

Merged

Conversation

charleszheng44
Copy link
Contributor

  1. Define the CRD of the nested components
  2. Define a standard way of creating nested components

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 10, 2020
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 10, 2020
@charleszheng44
Copy link
Contributor Author

/resolve #7

@brightzheng100
Copy link

It'd be perfect if we could illustrate where the CRs and instances are created/placed, especially under the context of multiple tenant / nested clusters are supported, in the architectural diagrams.

@charleszheng44
Copy link
Contributor Author

@brightzheng100 thanks for the suggestion. I will edit the proposal and explain where each CR will be placed.

In short: The NCPT is a cluster scope CRD, the NCP will be placed in the namespace specified by the end-users, and the Nested CRs will located in the clusternamespace created by the NCP controller.

Copy link
Member

@christopherhein christopherhein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome job on this @charleszheng44 added some suggestions throughout.

proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane.md Outdated Show resolved Hide resolved
@christopherhein
Copy link
Member

christopherhein commented Nov 11, 2020

Original message:

In short: The NCPT is a cluster scope CRD, the NCP will be placed in the namespace specified by the end-users, and the Nested CRs will located in the clusternamespace created by the NCP controller.

See my suggestions as well, I wonder if we could implement this part slightly different for NCPs vs how it was done in VC.

@christopherhein
Copy link
Member

christopherhein commented Nov 12, 2020

@Fei-Guo did you mean to edit my message? 😛

@christopherhein
Copy link
Member

From @Fei-Guo with regards to my message about NCPT

See my suggestions as well, I wonder if we could implement this part slightly different for NCPs vs how it was done in VC.

If we plan to let user create CRs by himself, it is likely he has to to deal with setting proper owner references for all the CRs.
Let NCP controller handle CR creations will ease user.

Anyway, if we design a mechanism to identify all the associated components, we may be able to workaround the owner reference limitation. All we need is that each component has a way to find other components.

@christopherhein
Copy link
Member

christopherhein commented Nov 12, 2020

If we plan to let user create CRs by himself, it is likely he has to to deal with setting proper owner references for all the CRs.
Let NCP controller handle CR creations will ease user.

Totally, I think we need to have some way, like using a controlPlaneRef on each CR component https://github.com/kubernetes-sigs/cluster-api-provider-nested/pull/11/files/4b6d8aea98ed04eea543f2e29319567d4bab5207#diff-1002f0da6945eba89a0115988309b5629a0466ae80661df7859daf855f6ea9f1R277 which ties it back to the owning NCP. This would be easy for a user to setup. I agree on ease of use, but concerned about making something a central orchestrator as the NCP when that is probably better suited for higher-level abstractions.

I'm worried if we add too much automation upfront then we'll have a difficult time rolling it back, it would be relatively easy to eventually add "creating the CRs" in a NestedControlPlaneTemplate controller later down the line to add that same type of higher-level abstractions back into CAPN.

@Fei-Guo
Copy link

Fei-Guo commented Nov 12, 2020

Yes. ControllerplanRef works which is exactly in the proposal here. How to find associated CRs from NCP? We propose to add objectRef in NCP status. Do you agree or have some other ideas?

@christopherhein
Copy link
Member

christopherhein commented Nov 12, 2020

Yes. ControllerplanRef works which is exactly in the proposal here. How to find associated CRs from NCP? We propose to add objectRef in NCP status. Do you agree or have some other ideas?

We could use metadata.OwnerReferences and in the NCP controller configure a custom cache index for each of the CRs to make the look up fast. this is predicated on #11 (comment)

@charleszheng44
Copy link
Contributor Author

charleszheng44 commented Nov 16, 2020

Yes. ControllerplanRef works which is exactly in the proposal here. How to find associated CRs from NCP? We propose to add objectRef in NCP status. Do you agree or have some other ideas?

We could use metadata.OwnerReferences and in the NCP controller configure a custom cache index for each of the CRs to make the look up fast. this is predicated on #11 (comment)

I think the problem is how can one CR controller get another CR, for example, KAS controller may wanna to get the etcd address that is stored in the Etcd CR. We could use the metadata.OwnerReference to get the NCP CR but I am not sure how can we use a custom cache index in the NCP controller to help the KAS controller to get the corresponding Etcd CR. But we can use the Selector field in the NCP status to retrieve corresponding CRs as mentiond in the previous comment

Comment on lines 131 to 138
// NestedEtcd contains information required by the etcd provider to create the Etcd
NestedEtcd NestedComponentSpec `json:"nestedEtcd"`

// NestedKAS contains information required by the apiserver provider to create the apiserver
NestedKAS NestedComponentSpec `json:"nestedKAS"`

// NestedKCM contains information required by the controller-manager provider to create the controller-manager
NestedKCM NestedComponentSpec `json:"nestedKCM"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// NestedEtcd contains information required by the etcd provider to create the Etcd
NestedEtcd NestedComponentSpec `json:"nestedEtcd"`
// NestedKAS contains information required by the apiserver provider to create the apiserver
NestedKAS NestedComponentSpec `json:"nestedKAS"`
// NestedKCM contains information required by the controller-manager provider to create the controller-manager
NestedKCM NestedComponentSpec `json:"nestedKCM"`
// Etcd contains information required by the etcd provider to create the Etcd
Etcd NestedComponentSpec `json:"etcd"`
// ApiServer contains information required by the apiserver provider to create the apiserver
ApiServer NestedComponentSpec `json:"apiServer"`
// ControllerManager contains information required by the controller-manager provider to create the controller-manager
ControllerManager NestedComponentSpec `json:"controllerManager"`

Given that all of these are already in a Nested struct

}
```

We will let users specify component providers through NCP, implement a native mode that has each sub-controller to create the component on the super cluster, and define three new types `EtcdProvider`, `KASProvider`, and the `KCMProvider`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we run all controllers all the time instead of allowing to turn them on/off? If controllers aren't reconciling anything because there are no resources / events, it should be fine?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here would be that in the config (or flags 🤮 ) we can turn off specific controllers and if you wanted to use an external etcd cluster provider like etcdadm or etcd-cluster-operator you can choose to flip off the built-in provider and you just supply your own controller listening for NEtcd and creating w/e the necessary components for that implementation. This will change a bit of the implementation as per - https://github.com/kubernetes-sigs/cluster-api-provider-nested/pull/11/files/4b6d8aea98ed04eea543f2e29319567d4bab5207#r521166698

Comment on lines 166 to 180
type NestedControlPlaneSpec struct {
// other fields....

// NestedEtcdProvider specifies which EtcdProvider will be used to create the Etcd
// +optional
NestedEtcdProvider EtcdProvider `json:"etcdProvider,omitempty"`

// NestedKASProvider specifies which KASProvider will be used to create the kube-apiserver
// +optional
NestedKASProvider KASProvider `json:"kasProvider,omitempty"`

// NestedKCMProvider specifies which KCMProvider will be used to create the kube-controller-manager
// +optional
NestedKCMProvider KCMProvider `json:"kcmProvider,omitempty"`
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type NestedControlPlaneSpec struct {
// other fields....
// NestedEtcdProvider specifies which EtcdProvider will be used to create the Etcd
// +optional
NestedEtcdProvider EtcdProvider `json:"etcdProvider,omitempty"`
// NestedKASProvider specifies which KASProvider will be used to create the kube-apiserver
// +optional
NestedKASProvider KASProvider `json:"kasProvider,omitempty"`
// NestedKCMProvider specifies which KCMProvider will be used to create the kube-controller-manager
// +optional
NestedKCMProvider KCMProvider `json:"kcmProvider,omitempty"`
}
type NestedControlPlaneSpec struct {
// other fields....
// EtcdProvider specifies which EtcdProvider will be used to create the Etcd
// +optional
EtcdProvider EtcdProvider `json:"etcdProvider,omitempty"`
// ApiServerProvider specifies which KASProvider will be used to create the kube-apiserver
// +optional
ApiServerProvider KASProvider `json:"apiServerProvider,omitempty"`
// ControllerManagerProvider specifies which KCMProvider will be used to create the kube-controller-manager
// +optional
ControllerManagerProvider KCMProvider `json:"controllerManagerProvider,omitempty"`
}

Same suggestion as above regarding naming.

On a more general note, do we need pluggable providers for the first iteration? Would it be ok to reduce scope a bit and only allow the built-in controllers for etcd/api-server/controller-manager instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't fully as it's documented IMO, this was called out a couple places. Specifically here - https://github.com/kubernetes-sigs/cluster-api-provider-nested/pull/11/files/4b6d8aea98ed04eea543f2e29319567d4bab5207#r521166698 As long as we have a way to swap the controllers for specific implementations that would be sufficient, IE I could use an etcd prefix or an operator backing the NEtcd controller instead and manage that out-of-tree.

I'm very hesitant to add too much pluggability upfront except controlling what controller manages each object. Since there are so much prior art in the OSS for etcd controllers and we should be able to support those without forking.

@vincepri
Copy link
Member

@christopherhein @Fei-Guo @charleszheng44 Are there concrete use cases that we have today for having multiple CR references inside linked to the main resource? It seems from prior conversations (been catching up today) that all of these resources are effectively linked together and we might want to reconsider having a pluggable abstraction layer until later on.

In other words, consider starting from a (mild) opinionated approach that doesn't allow pluggable controllers, we could always get there with time and as use cases occur, but for now it seems we can iterate an learn by having a simplified model.

What do you folks think?

@christopherhein
Copy link
Member

christopherhein commented Nov 16, 2020

@christopherhein @Fei-Guo @charleszheng44 Are there concrete use cases that we have today for having multiple CR references inside linked to the main resource? It seems from prior conversations (been catching up today) that all of these resources are effectively linked together and we might want to reconsider having a pluggable abstraction layer until later on.

In other words, consider starting from a (mild) opinionated approach that doesn't allow pluggable controllers, we could always get there with time and as use cases occur, but for now it seems we can iterate an learn by having a simplified model.

What do you folks think?

The thought process with individual controllers for each component is it allows us to build more stable upgrade flows for nested control planes. If we have each component use standard labels and statuses we can have etcd staying stable at one version while the apiserver or controller manager is being in-place upgraded through standard kubernetes mechanisms. What you are referencing is much like what we have with ClusterVersion's in today's VC which starts to break down as you need to have more customized control planes and upgrades, for example how we have the ability to change flags on a per control plane basis and my big use case for "pluggability" (which is sort of a hack in how I'd like to see it) relies on someone else building their own controller that supports Nested* and just turns off the in-tree controler this is to support the prior art for these components which I don't think CAPN needs to fully own, specifically etcd.

I agree to start smaller and opinionated is better upfront and the in-tree providers should be able to do this just requiring the user to create those CRs when bringing up the cluster eg. Create NC, NCP, NEtcd, NKAS and NCM and just reference back to the owning objects, for the most part, those should be nearly blank unless you need to override something specific like adding a FeatureGate.

WDYT? Would this cause concerns?

@vincepri
Copy link
Member

This all makes sense, thanks for providing some more color.

Create NC, NCP, NEtcd, NKAS and NCM and just reference back to the owning objects, for the most part, those should be nearly blank unless you need to override something specific like adding a FeatureGate.

If the references are left blank, we could have the main controller to default-create the in-tree controllers on its own, further simplifying the UX. We could also make sure of controller owner references to signal that these CR are managed by the main controller

@christopherhein
Copy link
Member

christopherhein commented Nov 16, 2020

This all makes sense, thanks for providing some more color.

Create NC, NCP, NEtcd, NKAS and NCM and just reference back to the owning objects, for the most part, those should be nearly blank unless you need to override something specific like adding a FeatureGate.

If the references are left blank, we could have the main controller to default-create the in-tree controllers on its own, further simplifying the UX. We could also make sure of controller owner references to signal that these CR are managed by the main controller

Yea, we discussed this a little bit on the last call. I agree it could be created up-front by the NCP although we'd need a way to reconcile since in-theory we'd have someone kubectl apply -f ./ the list of resources and it might create NCP before NEtcd since we don't have a "stack-like" object so if NCP created NEtcd and then the local NEtcd CR was applied it could fail to be created because of conflicts, maybe SSA would help.

My initial thought was since we eventually plan to bring back NCPT's we could use that as the higher-level orchestration/abstraction and it could create those CRs and the control plane namespace to give the same virtualcluster/clusterversion style experience but doesn't force us to manage those flows in the low-level component controllers.

Have any suggestions, is there prior implementations doing something like this we could use as a reference?

@vincepri
Copy link
Member

Yea, we discussed this a little bit on the last call. I agree it could be created up-front by the NCP although we'd need a way to reconcile since in-theory we'd have someone kubectl apply -f ./ the list of resources and it might create NCP before NEtcd since we don't have a "stack-like" object so if NCP created NEtcd and then the local NEtcd CR was applied it could fail to be created because of conflicts, maybe SSA would help.

I might be missing something (also getting lost with all the acronyms :)), if a user kubectl apply a bunch of yaml, and they have a custom provider for NestedEtcd, I'd expect them to setup the ObjectReference in NCP correctly? Otherwise, as you said, the NCP controller is going to default it and create one on its own

@christopherhein
Copy link
Member

christopherhein commented Nov 16, 2020

I might be missing something (also getting lost with all the acronyms :)), if a user kubectl apply a bunch of yaml, and they have a custom provider for NestedEtcd, I'd expect them to setup the ObjectReference in NCP correctly? Otherwise, as you said, the NCP controller is going to default it and create one on its own

Doh' sorry I'm going off the expectation that there wasn't a bidirectional references between NCP to N* components, you are right that would solve the problem, I was going off the expectation that the NCP didn't have those and the component CRs were the only thing referencing back to an NCP.

If NCP has:

    // EtcdProvider specifies which EtcdProvider will be used to create the Etcd
    // +optional
    EtcdProvider EtcdProvider `json:"etcdProvider,omitempty"`
     
    // ApiServerProvider specifies which KASProvider will be used to create the kube-apiserver
    // +optional
    ApiServerProvider KASProvider `json:"apiServerProvider,omitempty"`
 
    // ControllerManagerProvider specifies which KCMProvider will be used to create the kube-controller-manager
    // +optional
    ControllerManagerProvider KCMProvider `json:"controllerManagerProvider,omitempty"`

and each Nested Component CR has an OwnerReference that would suffice. Sorry for the confusion.

Copy link
Member

@christopherhein christopherhein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple more nits, this is looking really good. thanks @charleszheng44

proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved

1. The user generates all CRs and apply them.

2. As the user chooses to use the etcd-cluster-operator(ECO), the ECO controller will create the EtcdCluster CR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe #1 should be deployed custom etcd controller implementation, #2 turns off in-tree etcd. I feel like chooses doesn't cover what they are doing.

proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
@charleszheng44 charleszheng44 force-pushed the proposal/master-controllers branch 2 times, most recently from da045fe to bdf20be Compare November 18, 2020 23:10
Copy link
Member

@christopherhein christopherhein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking really good, couple more questions and a suggestion on NestedControlPlaneSpec.

proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
proposals/20201026-creating-control-plane-components.md Outdated Show resolved Hide resolved
@christopherhein
Copy link
Member

Once these last things are cleared up can you also squash the commits and we can get this merged and broken down to individual issues.

@charleszheng44
Copy link
Contributor Author

charleszheng44 commented Nov 19, 2020

Once these last things are cleared up can you also squash the commits and we can get this merged and broken down to individual issues.

Yes, just squashed the commits and create a new PR for moving terms to glossary #17 .

@christopherhein
Copy link
Member

Nice work @charleszheng44, I'm good to go on this.

/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 19, 2020
@christopherhein
Copy link
Member

I would like someone else to give it lgtm. Just to have more consensus before.

@christopherhein
Copy link
Member

/kind design

@k8s-ci-robot k8s-ci-robot added the kind/design Categorizes issue or PR as related to design. label Nov 19, 2020
@christopherhein
Copy link
Member

/priority important-soon

@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 19, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: charleszheng44, christopherhein

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Fei-Guo Fei-Guo merged commit 00ca26f into kubernetes-sigs:master Nov 20, 2020
@charleszheng44 charleszheng44 deleted the proposal/master-controllers branch August 30, 2021 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants