
Externalize provider-specific specs and status in separate CRDs #833

Open
pablochacin opened this issue Mar 20, 2019 · 19 comments


@pablochacin (Contributor) commented Mar 20, 2019

/kind feature

Describe the solution you'd like
Presently, the Cluster and Machine CRDs include an opaque, provider-specific representation of these resources and their status:

ProviderSpec ProviderSpec `json:"providerSpec,omitempty"`

An alternative approach would be to use CRDs for the provider-specific specs and status and keep an ObjectReference to them:

ProviderSpec corev1.ObjectReference

With respect to the provider-specific status, the ClusterAPI controllers should watch the provider CRD and update the ClusterAPI CRD status if a change is detected.
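For concreteness, a minimal Go sketch of the two shapes side by side (type names are illustrative; in the current API the inline field wraps a runtime.RawExtension):

```go
// Sketch only: contrasts the inline and by-reference shapes.
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

// Today: provider-specific configuration is embedded as an opaque blob.
type MachineSpecInline struct {
	ProviderSpec runtime.RawExtension `json:"providerSpec,omitempty"`
}

// Proposed: provider-specific configuration lives in its own CRD
// (e.g. a hypothetical FooMachine) and is linked by reference.
type MachineSpecByRef struct {
	ProviderSpec corev1.ObjectReference `json:"providerSpec,omitempty"`
}
```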

Pros:

  • Explicit visibility of the dependency on providers by means of the object reference
  • Independent reconcile cycles for provider-specific and ClusterAPI CRDs
  • Supporting tools like clusterctl could use a plugin approach for handling provider-specific specs and status

Cons:

  • None identified so far.

Anything else you would like to add:

@ncdc (Contributor) commented Mar 20, 2019

@pablochacin would you be willing to defer discussion on this while we plan out the organization we discussed in today's community meeting around roadmap items, and using KEPs for things like this instead of individual issues?

@pablochacin (Contributor, Author) commented Mar 21, 2019

@ncdc Sure. My intention with this issue was to put together the ideas I had expressed as comments in other places (issues, documents) and to facilitate the discussion.

@timothysc added this to the v1alpha2 milestone Apr 4, 2019

@timothysc (Member) commented Apr 4, 2019

@mhrivnak commented Apr 29, 2019

I love a lot of things about this approach and think it could be very successful. Some thoughts:

Considering a provider "Foo", I would make a FooMachine CRD with FooMachineSpec and FooMachineStatus types, just like any other normal CRD. Is that what you are proposing? Then from a Machine I would reference a FooMachine resource via an ObjectReference. I'm not sure there is value in splitting FooMachine up into dedicated "Spec" and "Status" CRDs that themselves don't really have spec and status.

Enabling the cluster-api MachineController to watch FooMachine CRDs and enqueue their owner, a Machine, would, I think, cover most use cases. Is there a use case you can think of for having a FooMachine controller? From a controller design standpoint, I lean toward just reconciling the Machine, and letting that workflow utilize the FooMachine as necessary. Ditto for Cluster of course.
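A rough sketch of that wiring with controller-runtime, assuming a hypothetical FooMachine type and illustrative import paths (one possible shape, not the project's actual code):

```go
package controllers

import (
	ctrl "sigs.k8s.io/controller-runtime"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha2"
	foov1 "example.com/provider-foo/api/v1alpha1" // hypothetical provider API
)

// MachineReconciler reconciles Machines; provider-specific work reads and
// writes the owned FooMachine instead of running in a separate controller.
type MachineReconciler struct{}

func (r *MachineReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
	// Fetch the Machine, then fetch/create its FooMachine and act on it.
	return ctrl.Result{}, nil
}

// SetupWithManager re-queues the owning Machine whenever one of its owned
// FooMachine objects changes.
func (r *MachineReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&clusterv1.Machine{}).
		Owns(&foov1.FooMachine{}).
		Complete(r)
}
```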

It's not clear to me how best to handle this from a MachineSet. How would we templatize the FooMachine? We could have a MachineSet reference its own copy of a FooMachine, and have it make a copy of that for each Machine it creates. Other ideas?
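One possible answer, sketched under the assumption of a provider-side template CRD (FooMachineTemplate and all of its fields are invented here): the MachineSet references the template, and the controller copies Template.Spec into a fresh FooMachine for each Machine it creates.

```go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// FooMachineSpec stands in for the provider's real spec fields.
type FooMachineSpec struct {
	InstanceType string `json:"instanceType,omitempty"`
}

// FooMachineTemplate is referenced from a MachineSet; Template.Spec is
// copied into a new FooMachine for every Machine the set creates.
type FooMachineTemplate struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Template FooMachineTemplateResource `json:"template"`
}

type FooMachineTemplateResource struct {
	Spec FooMachineSpec `json:"spec"`
}
```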

Lastly, I'll just correct the perception of how metalkube is using an extra CRD, because it's fairly different from this proposal. (I'm a main author of that provider). We did not make a CRD to substitute for ProviderSpec and ProviderStatus. We did make a CRD that does provisioning and hardware management. The BareMetalHost CRD results in an API that you can think of as the local cloud API. Our actuator, rather than talking to the AWS or GCP cloud APIs, instead talks to the BareMetalHost API. Hopefully that helps; in any case, I don't think metalkube is doing anything similar to this proposal.

@pablochacin (Contributor, Author) commented May 6, 2019

@mhrivnak thanks for the clarification regarding how metalkube uses the CRD.

@pablochacin (Contributor, Author) commented May 6, 2019

@mhrivnak Your suggestion about using a generic controller watching the provider CRD seems interesting. I don't know if you are participating in the discussion about extension mechanisms for the cluster-api. That would be a good place to present and discuss the idea. Interested?

@vincepri (Member) commented Jun 10, 2019

/area api

@pablochacin (Contributor, Author) commented Jun 12, 2019

@vincepri @timothysc I would like to work on this issue (it seems I cannot assign it to myself, can I?)

However, I have two concerns:

  1. This issue predates the work towards v1alpha2, so there is significant overlap with the work on Machine States and Bootstrapping.
  2. v1alpha2 doesn't target any changes to the cluster-api data model.

@ncdc (Contributor) commented Jun 12, 2019

@pablochacin would you be willing to write up a proposal for the changes to the Cluster type to switch from inline to object references? I imagine this would also include removing the cluster Actuator interface (to match what's proposed in #997) and replacing it with a cluster infrastructure controller.

@pablochacin (Contributor, Author) commented Jun 12, 2019

@ncdc Yes, I would be interested. Now, I expect the cluster data model to change significantly, according to what we discussed in the data model workstream, for instance, the references to infrastructure and control-plane objects. So is it timely to make this change? If so, I'm in.

Regarding the machine part, I'm not sure how to approach this change in coordination with the proposal we have on the table.

@detiber (Member) commented Jun 12, 2019

@pablochacin I am in favor of including this for v1alpha2 assuming:

  • A proposal is created for it as suggested by @ncdc
  • We have commitment from one or more people to do the implementation

@ncdc (Contributor) commented Jun 12, 2019

If we want to do this in "small" steps, I'd suggest the only change we make for starters is the combination of removing the inline providerSpec and providerStatus fields (replacing them with object references to "cluster infrastructure" provider-specific CRDs, whatever they may look like per provider) and switching from a cluster Actuator to a cluster infrastructure controller. I think this would get rough alignment/coordination with #997.
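Sketched against the Cluster type, that first step might look something like this (field and type names are assumptions, not an agreed API):

```go
package v1alpha2

import (
	corev1 "k8s.io/api/core/v1"
)

// Sketch of the "small step": the inline providerSpec/providerStatus
// fields are gone, replaced by a reference to a provider-owned
// infrastructure object reconciled by its own controller.
type ClusterSpec struct {
	// ...existing fields unchanged...

	// InfrastructureRef points at a provider-specific cluster
	// infrastructure resource (e.g. an AWSCluster).
	InfrastructureRef *corev1.ObjectReference `json:"infrastructureRef,omitempty"`
}
```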

A possible next step after this, maybe for v1alpha3, could be to further break up the data model into infrastructure vs control plane vs other things.

@pablochacin (Contributor, Author) commented Jun 12, 2019

@ncdc sounds like a plan.

@ncdc (Contributor) commented Jun 12, 2019

I'm going to mark this p0 and move it to the v1alpha2 milestone as this issue covers both Machine and Cluster providerSpec/Status and the current plan is at least to tackle the fields in Machine for v1alpha2. And if we can get the proposal for the Cluster changes approved & have someone sign up to do it, 👍!

If we need to split this up so we have 1 issue for Cluster, and a separate for Machine, please let me know @timothysc.

/milestone v1alpha2
/priority critical-urgent

@pablochacin referenced this issue Jun 20, 2019: Add v1alpha2 types 🎉 #1051 (merged)

@vincepri (Member) commented Jul 3, 2019

/assign

@timothysc (Member) commented Jul 3, 2019

@pablochacin @ncdc - could folks please update this issue?

@ncdc (Contributor) commented Jul 3, 2019

Pablo to write a proposal.

@ncdc (Contributor) commented Jul 22, 2019

/reopen

The proposal has merged but we haven't modified the code yet. That's happening in #1177

@k8s-ci-robot (Contributor) commented Jul 22, 2019

@ncdc: Reopened this issue.

In response to this:

/reopen

The proposal has merged but we haven't modified the code yet. That's happening in #1177

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot reopened this Jul 22, 2019

@vincepri referenced a pull request that will close this issue Jul 22, 2019: Cluster controller and types for v1alpha2 #1177 (open)
