Claim portability using portable classes #743

hasheddan · 2019-09-04T05:16:07Z

This PR updates the default class design doc to include the iterations that have taken place since the original implementation, as well as a proposal to re-establish claim portability using Policies as discussed in #703 and #723.

Checklist

I have:

Run make reviewable to ensure this PR is ready for review.
Ensured this PR contains a neat, self documenting set of commits.
Updated any relevant documentation, examples, or release notes.
Updated the RBAC permissions in clusterrole.yaml to include any new types.

upbound-bot · 2019-09-04T05:25:44Z

upbound-bot · 2019-09-04T16:25:18Z

suskin

Neat! :coffee_parrot:

suskin · 2019-09-05T00:50:27Z

design/one-pager-default-resource-class.md

+the first iteration of default classes allowed developers to have the option to
+omit the class reference and rely on falling back to whatever operators deemed
+an appropriate default. The default resource class was distinguished via the
+`defaultForClaimKinds` field:


Where is the defaultForClaimKinds field? Since this says via the . . . field: and ends with a colon right before the example, I was expecting the example to have it, but it doesn't. Should this be updated to talk about the labeling thing that appears to be describing a default?

This is a typo that should be corrected! Thanks for pointing that out :)

suskin · 2019-09-05T00:57:03Z

design/one-pager-default-resource-class.md

+With the implementation of [strongly-typed resource
+classes](./one-pager-strongly-typed-class.md), the generic `ResourceClass`
+became obsolete and the `Policy` kind was introduced. Each claim kind had a
+corresponding policy kind (i.e. `MySQLInstancePolicy` for `MySQLInstance`,


Nitpick: these and other uses of i.e. seem like examples to me, which means they should probably be e.g. instead of i.e.

Appreciate the attention to detail, will correct!

suskin · 2019-09-05T00:58:13Z

design/one-pager-default-resource-class.md

+`ResourceClass` that specified
+`mysqlinstance.database.crossplane.io/default:true`, there were now many
+different class `kinds` that the claim could potentially reference (i.e. GCP
+`CloudsqlInstanceClass`, AWS `RDSInstanceClass`, etc.). `Policies` were


I like that I feel like I'm reading a story right now 📖

I should probably change "there were now many" to just "there were many". I took a very different approach with this design doc in that I tried to preserve historical thinking because I believe it informs our current direction. I think the verdict is still out on if this is a good way to propose updated designs though.

I'm a strong believer in including this kind of context in design documents. I like to include it in a "background" section ahead of any goals or proposals, presuming that my audience has limited familiarity with the topic at hand. Even folks who know the project well might be rusty on on the deep technical subject matter, so it helps you to help them reach the same conclusion you have by telling a story.

suskin · 2019-09-05T00:59:55Z

design/one-pager-default-resource-class.md


-Currently, each Crossplane resource kind (i.e. GKE Cluster, AWS S3 Bucket, etc.) has a controller that reconciles claims for that resource by binding them to the corresponding managed type. These controllers use [predicates](https://github.com/crossplaneio/crossplane/blob/master/pkg/resource/predicates.go) to ensure that there is a provisioner defined for the class referenced by the claim. If the claim contains no reference to a class, the controller will not act on the claim.
+They would then create a `MySQLInstancePolicy` that referenced this class:


In most "real" use-cases, the administrator would most likely create a Namespace next, because they would probably want to create their things in a namespace which is not crossplane-system.

Good point! I will incorporate that thinking here.

suskin · 2019-09-05T01:02:16Z

design/one-pager-default-resource-class.md

+claims. Previously, claims could reference a generic `ResourceClass` by `name`
+and `namespace`, and could be moved across providers as long as there existed a
+class with that `name`. Now, claims must omit a `classRef` and rely on the
+existence of a `Policy` to achieve portability.


Did we already define the term portable and its variants? I am realizing this would probably be good to define, in case any readers are not sure what we mean by it.

I don't believe "portable" is defined anywhere in this doc, and I agree that it is important to do so.

suskin · 2019-09-05T01:15:50Z

design/one-pager-default-resource-class.md

+   [`Policy`](https://github.com/crossplaneio/crossplane-runtime/blob/e4d61ee2805af680baf16fc2a1d8f79538d0f9bb/apis/core/v1alpha1/resource.go#L93),
+   then bumping the dependency in core Crossplane.
+1. Remove the `classRef` field from all claims entirely. This would require the
+   same steps as above but with


It seems as though the motivation for removing classRef is to force claims to always be portable (in the sense that there will never be references to a strongly-typed Thing which is provider-specific). If this is the case, I would suggest that we also state that as one of the goals, instead of simply saying we want the level of portability that we had previously.

Here is the same thing as a question: for the stated goals, do we really need to remove the classRef field? What is the motivation for doing it this way instead of just adding the support for the policy ref?

I had this same question, but @prasek had some good arguments as to why it should be removed. I will let him provide further context.

What is the motivation for doing it this way instead of just adding the support for the policy ref?

One consideration is that if we support both a policyRef and a classRef we support two ways to specify resource class a claim should use; directly or via a policyRef. This is not necessarily an argument for removing classRef (though my intuition is that we should), but if we did keep it we'd have to be able to clearly communicate why we support setting the classRef two ways, which way is preferred in which situations, and what happens if both are specified.

suskin · 2019-09-05T01:18:54Z

design/one-pager-default-resource-class.md

+   claims that have a `policyRef` to a `Policy` that has a `classRef` of that
+   kind's strongly-typed class. For instance, the `RDSInstance` claim controller
+   will only reconcile `MySQLInstance` claims that have `policyRef` to a
+   `MySQLInstancePolicy` that has a `classRef` to an `RDSInstanceClass`. This


Do we imagine the mapping of RDSInstance to RDSInstanceClass as a generated/dynamic mapping?

https://godoc.org/github.com/crossplaneio/crossplane-runtime/pkg/resource#NewClaimReconciler

This mapping is currently specified when instantiating each resource claim controller's generic reconciler. I imagine under this proposal we'd also have to specify which which policy kind the reconciler should use. We could do so using the same pattern, but might want to rethink it since the NewClaimReconciler signature is getting fairly long.

suskin · 2019-09-05T01:19:56Z

design/one-pager-default-resource-class.md

+   [`ResourceClaimSpec`](https://github.com/crossplaneio/crossplane-runtime/blob/e4d61ee2805af680baf16fc2a1d8f79538d0f9bb/apis/core/v1alpha1/resource.go#L49).
+1. Alter all claim controllers (which live in provider stacks) to only reconcile
+   claims that have a `policyRef` to a `Policy` that has a `classRef` of that
+   kind's strongly-typed class. For instance, the `RDSInstance` claim controller


What are the performance considerations to this approach, if any? I am having trouble imagining an efficient way to do this lookup without having every controller inspect every claim and its policy.

To be totally honest, I am not quite sure of potential performance implications. However, I imagine a simple object reference lookup would be relatively inexpensive. This would obviously grow with number of claim reconcilers introduced though.

As @hasheddan implies, we instantiate a resource claim controller for each (resource claim, managed resource) tuple. So there's a resource claim controller for (MySQLInstance, RDSInstance), (KubernetesCluster, GKECluster), etc. This allows resource claim controllers to live in cloud provider stacks, making it easy for new cloud providers to add managed resources that satisfy the set of portable resource claims defined by Crossplane core.

Before we introduced strongly typed resource classes the generic ResourceClass kind contained a .provisioner string that would specify the kind of managed resource that should be used to satisfy any claim referencing the class. This field read like provisioner: rdsinstance.database.aws.crossplane.io/v1alpha1. This meant that the resource claim controllers determined which claims they should operate on by:

Watching for changes to all claims of the kind they were concerned with.

Using a watch predicate function to filter out claims that did not reference a ResourceClass whose provisioner mapped to the managed resource kind they were concerned with.

The predicate function mentioned in step 2 of the above process used a controller-runtime client.Client to Get the referenced ResourceClass. So every time a resource claim changed in the Kubernetes API, one controller for each managed resource kind that could possibly satisfy that claim would Get its ResourceClass to check whether they should. This sounds pretty bad, but fortunately the default client.Client instance is backed by a cache of API objects so we were not likely to make an actual API call for each controller. We managed to cut out this client.Client when we switched to strongly typed resource classes, instead using the HasClassReferenceKind predicate.

I imagine we'd return to a similar predicate under this proposal, only we'd Get and inspect the relevant policy instead. For example the (MySQLInstance, RDSInstance) resource claim controller would:

Watch for changes to all MySQLInstance resource claims.

Use a watch predicate to filter out claims that did not reference a MySQLInstancePolicy whose classRef was of kind: RDSInstanceClass. This would involve a Get, but frequently hit the cache as described above.

suskin · 2019-09-05T01:23:49Z

design/one-pager-default-resource-class.md

+
+This feature is similar to the original default class model in that it uses
+labels to specify which class of service to use as default when a `policyRef` is
+omitted. For each `namespace`, there must be only one policy that has the


there must be only one policy

How do we expect to enforce this? Is there a way to enforce this at the time the policy is created?

Related: what is the reason for this restriction? What if I want to round-robin my created resources between multiple configurations? This is sounding more like a Best Practice than a Hard Requirement to me.

Right now this requirement is because we would have no way of knowing which policy to default to if multiple were defined as default. I definitely see a more intelligent and varied defaulting pattern in the future. The first iteration could involve a weight label or something of that nature where the highest weighted policy in the namespaces could be applied. However, I am not sure that this is in scope for this iteration. Would love to hear other thoughts on this.

cc @prasek @negz

The only way I'm aware to enforce this would be to use an admission webhook (or rather a series thereof) that rejected new policies that were labelled default for any namespace that already contained the same kind of policy labelled such. Crossplane currently has no webhooks, and likely won't for 0.3, so at least at first we'd probably just have to make it clear that the default policy is undefined when multiple policies are labelled default.

suskin · 2019-09-05T01:26:37Z

design/one-pager-default-resource-class.md

+resource classes in the same `Policy` and picking one based on specific
+parameters defined in the claim.
+
+It is also likely that strongly-typed claims could be introduced at the provider


strongly-typed claims could be introduced

How are we imagining this working? Is this talking about using something like bringing back the classRef field in a claim, or is it talking about something like a new claim type which only works with a single provider?

The latter. I think it is definitely possible that we will want non-portable resources to still be able to be dynamically provisioned.

negz

This proposal seems sound to me. Thanks @hasheddan and @prasek for working through this!

negz · 2019-09-05T04:21:25Z

design/one-pager-default-resource-class.md

@@ -36,10 +55,12 @@ providerRef:
 reclaimPolicy: Delete
 ```

-This class would likely be created by an operator as a type of database that developers may deploy as part of their application. Currently, for a developer to deploy an RDS instance on AWS, they would have to explicitly reference it:
+This class would likely be created by an operator as a type of database that
+developers may deploy as part of their application. Currently, for a developer


Suggested change

developers may deploy as part of their application. Currently, for a developer

developers may deploy as part of their application. Originally, for a developer

negz · 2019-09-05T04:45:46Z

design/one-pager-default-resource-class.md

+`ResourceClass` that specified
+`mysqlinstance.database.crossplane.io/default:true`, there were now many
+different class `kinds` that the claim could potentially reference (i.e. GCP
+`CloudsqlInstanceClass`, AWS `RDSInstanceClass`, etc.). `Policies` were


I'm a strong believer in including this kind of context in design documents. I like to include it in a "background" section ahead of any goals or proposals, presuming that my audience has limited familiarity with the topic at hand. Even folks who know the project well might be rusty on on the deep technical subject matter, so it helps you to help them reach the same conclusion you have by telling a story.

negz · 2019-09-05T04:46:14Z

design/one-pager-default-resource-class.md

+classes](./one-pager-strongly-typed-class.md), the generic `ResourceClass`
+became obsolete and the `Policy` kind was introduced. Each claim kind had a
+corresponding policy kind (i.e. `MySQLInstancePolicy` for `MySQLInstance`,
+etc.). Claims controllers could no longer simply omit a `classRef` because their


Suggested change

etc.). Claims controllers could no longer simply omit a `classRef` because their

etc.). Claims could no longer simply omit a `classRef` because their

negz · 2019-09-05T04:50:03Z

design/one-pager-default-resource-class.md

 ```

-And a resource claim that defaulted to the resource class above could look like this in the general case:
+To continue to provide the same level of portability for claims that was


Everything from the Proposal: Default Class Reference v2 & Claim Portability heading to this point of the document is more continued background than a proposal. I suggest moving it under some kind of "background" or "problem" heading.

negz · 2019-09-05T04:54:55Z

design/one-pager-default-resource-class.md

+The `Policy` method continues to enable default class references, but
+strongly-typed resource classes introduce a reduction in portability of resource
+claims. Previously, claims could reference a generic `ResourceClass` by `name`
+and `namespace`, and could be moved across providers as long as there existed a


"Moved across providers" reads a little like a satisfied claim could be moved from one provider to another, suggesting that perhaps Crossplane could move your MySQLInstance from AWS to Azure which it sadly cannot.

Perhaps "could be satisfied by a compatible managed resource regardless of cloud provider"?

negz · 2019-09-05T05:05:56Z

design/one-pager-default-resource-class.md

+   [`Policy`](https://github.com/crossplaneio/crossplane-runtime/blob/e4d61ee2805af680baf16fc2a1d8f79538d0f9bb/apis/core/v1alpha1/resource.go#L93),
+   then bumping the dependency in core Crossplane.
+1. Remove the `classRef` field from all claims entirely. This would require the
+   same steps as above but with


What is the motivation for doing it this way instead of just adding the support for the policy ref?

One consideration is that if we support both a policyRef and a classRef we support two ways to specify resource class a claim should use; directly or via a policyRef. This is not necessarily an argument for removing classRef (though my intuition is that we should), but if we did keep it we'd have to be able to clearly communicate why we support setting the classRef two ways, which way is preferred in which situations, and what happens if both are specified.

negz · 2019-09-05T06:03:12Z

design/one-pager-default-resource-class.md

+   [`ResourceClaimSpec`](https://github.com/crossplaneio/crossplane-runtime/blob/e4d61ee2805af680baf16fc2a1d8f79538d0f9bb/apis/core/v1alpha1/resource.go#L49).
+1. Alter all claim controllers (which live in provider stacks) to only reconcile
+   claims that have a `policyRef` to a `Policy` that has a `classRef` of that
+   kind's strongly-typed class. For instance, the `RDSInstance` claim controller


As @hasheddan implies, we instantiate a resource claim controller for each (resource claim, managed resource) tuple. So there's a resource claim controller for (MySQLInstance, RDSInstance), (KubernetesCluster, GKECluster), etc. This allows resource claim controllers to live in cloud provider stacks, making it easy for new cloud providers to add managed resources that satisfy the set of portable resource claims defined by Crossplane core.

Before we introduced strongly typed resource classes the generic ResourceClass kind contained a .provisioner string that would specify the kind of managed resource that should be used to satisfy any claim referencing the class. This field read like provisioner: rdsinstance.database.aws.crossplane.io/v1alpha1. This meant that the resource claim controllers determined which claims they should operate on by:

Watching for changes to all claims of the kind they were concerned with.

Using a watch predicate function to filter out claims that did not reference a ResourceClass whose provisioner mapped to the managed resource kind they were concerned with.

The predicate function mentioned in step 2 of the above process used a controller-runtime client.Client to Get the referenced ResourceClass. So every time a resource claim changed in the Kubernetes API, one controller for each managed resource kind that could possibly satisfy that claim would Get its ResourceClass to check whether they should. This sounds pretty bad, but fortunately the default client.Client instance is backed by a cache of API objects so we were not likely to make an actual API call for each controller. We managed to cut out this client.Client when we switched to strongly typed resource classes, instead using the HasClassReferenceKind predicate.

I imagine we'd return to a similar predicate under this proposal, only we'd Get and inspect the relevant policy instead. For example the (MySQLInstance, RDSInstance) resource claim controller would:

Watch for changes to all MySQLInstance resource claims.

Use a watch predicate to filter out claims that did not reference a MySQLInstancePolicy whose classRef was of kind: RDSInstanceClass. This would involve a Get, but frequently hit the cache as described above.

negz · 2019-09-05T06:10:29Z

design/one-pager-default-resource-class.md

+   claims that have a `policyRef` to a `Policy` that has a `classRef` of that
+   kind's strongly-typed class. For instance, the `RDSInstance` claim controller
+   will only reconcile `MySQLInstance` claims that have `policyRef` to a
+   `MySQLInstancePolicy` that has a `classRef` to an `RDSInstanceClass`. This


https://godoc.org/github.com/crossplaneio/crossplane-runtime/pkg/resource#NewClaimReconciler

This mapping is currently specified when instantiating each resource claim controller's generic reconciler. I imagine under this proposal we'd also have to specify which which policy kind the reconciler should use. We could do so using the same pattern, but might want to rethink it since the NewClaimReconciler signature is getting fairly long.

negz · 2019-09-05T06:11:51Z

design/one-pager-default-resource-class.md

+   as well as the actual logic of the [shared claim
+   reconciler](https://github.com/crossplaneio/crossplane-runtime/blob/e4d61ee2805af680baf16fc2a1d8f79538d0f9bb/pkg/resource/claim_reconciler.go#L281),
+   both in `crossplane-runtime`. It should require minimal, if any, updates to
+   the actual claim controllers in each of the provider stacks.


I believe the claim controllers will need to be updated to tell their generic claim reconcilers which policy kind they should use, but this should be a fairly mechanical, one line change for each controller.

negz · 2019-09-05T06:16:25Z

design/one-pager-default-resource-class.md

+
+This feature is similar to the original default class model in that it uses
+labels to specify which class of service to use as default when a `policyRef` is
+omitted. For each `namespace`, there must be only one policy that has the


The only way I'm aware to enforce this would be to use an admission webhook (or rather a series thereof) that rejected new policies that were labelled default for any namespace that already contained the same kind of policy labelled such. Crossplane currently has no webhooks, and likely won't for 0.3, so at least at first we'd probably just have to make it clear that the default policy is undefined when multiple policies are labelled default.

upbound-bot · 2019-09-05T13:58:29Z

negz

This looks like a solid direction to me. I'm approving, but please ensure all outstanding comments are address where relevant. We should follow up on @bassam's thoughts about possibly renaming the Policy type before merging.

prasek · 2019-09-06T06:29:25Z

LGTM.

Was talking with @bassam about naming and his concern was that Policies seemed to be playing the role that Resource Classes used to play before strongly-typed classes were introduced, and suggested that Policies should be called Classes.

If you consider that the Generic ResourceClass played 2 roles: (a) provided a named class of service and could be marked as the default class of service, and (b) provided a specTemplate that was moving from generic to strongly typed templates, then should we move from claim->policy->class to claim->class->template.

The claim->class->template naming would let us say “default resource classes” once again, and it feels natural to say a cluster admin would create a namespace for a team or project and populate it with the available classes of service that could be used in that namespace.

Having a future intelligent class-of-service could make sense as well and the claim->class->template names would likely map more closely to what they represent.

jbw976 · 2019-09-06T15:43:11Z

design/one-pager-default-resource-class.md

-  ...
+  policyRef:
+    name: standard-mysql
+    namespace: crossplane-system


currently in master, policies need to live in the same namespace as the claim. Is this a new ability in this PR as well, that policies can live in a different namespace from a claim that references them?

Now that policies can be referenced by the claim, they can reference a policy that exists in a different namespace (just like they can currently reference a class in a different namespace). If a policyRef was omitted, they would only be able to fall back on the policy specified as default in their namespace though.

I think this thread might be out of date now that we're planning to use portable classes rather than policies.

Given defaulting behaviour will be namespace scoped (i.e. we'll look for a portable class with label default="true" in the namespace of the claim) I suggest we consider requiring that portable classes live in the same namespace as the claims that reference them; i.e. we use a corev1.LocalObjectReference from claim to (portable) class.

As long as we use a corev1.ObjectReference from the portable class to its underlying non-portable class this still gives us flexibility on which namespace the non-portable class (and thus managed resource) are created in.

Consider the following flow:

Infrastructure operator creates infra namespace (e.g. prod).

Infrastructure operator creates a provider and some non-portable classes (e.g. RDSInstanceClass) in said namespace.

Infrastructure operator "publishes" the aforementioned non-portable classes to the app operator's namespace by creating portable some classes (e.g. MySQLInstanceClass) that reference the non-portable classes created in step 2, and labelling one of each kind as the default for said app namespace.

App operator lists the portable classes that are available in their app namespace, picks one, and creates a resource claim that references it.

jbw976 · 2019-09-06T15:51:52Z

@prasek @hasheddan I'm struggling a bit with the suggested claim -> class -> template naming, I think mostly because I liked the idea of Policies having more useful functionality in the future, such as ranges and constraints to help matching a class to a claim, and others.

if we went with claim -> class -> template, do you foresee us introducing the concept of policies again in the future when we'd actually have useful functionality to put in them? Or would that type of functionality fit OK on these "classes", instead of creating a new policy type of object?

hasheddan · 2019-09-06T15:59:33Z

@jbw976 it seems like it would make sense to have ranges and constraints in a "class". The naming is indicating that they represent a "class of service", and ranges and constraints would still make it a class of service IMO, just a more flexible / intelligent one. It is also important to note that this naming change would require significant updates and refactoring of docs.

prasek · 2019-09-06T16:03:34Z

+1 for intelligent classes of service in the future.

bassam · 2019-09-06T20:28:45Z

Stepping back a bit, I think we are trying to accomplish the following:

Claims need to reference a Class for a provisioning/binding to happen
Claims without a classRef should use a DefaultClass within a scope (namespace to start with)
Claims should reference compatible Classes
Portable Claims should not reference non-portable Classes

This solution described here (and also as a result of supporting strongly typed classes and defaulting) solves this by introducing a Policy construct that plays two roles:

Defines a default classRef for claims in a given scope (namespace)
Acts as a more portable indirection to a class.

I think doing both of these in a Policy object can be confusing. I would recommend that we separate the two. Specifically:

Rename Policy to DefaultClass and let it handle the defaulting of classRef within a scope only.
Introduce a portable Class that is similar to the concept of Class today but points at another classRef that is non-portable.

suskin · 2019-09-06T23:34:49Z

Policy construct that plays two roles:

Defines a default classRef for claims in a given scope (namespace)

Acts as a more portable indirection to a class.

I would suggest that a third role is also being discussed:

Match a claim with a class based on desired capabilities or characteristics of the class, in a portable way (for example, available memory).

I believe this is why people have been talking about "default" classes as more like policies; defaulting is just an edge case of matching. The edge case is: how does matching work if no capabilities were specified?

Because of this, I would also suggest that "match" or "resolve" could be a clearer term than "reference". When we say:

Claims need to reference a Class for a provisioning/binding to happen

it would probably be clearer to say that Claims need to match with (or resolve to) a Class for provisioning/binding to happen. The matched Class can be explicitly specified on the claim. Or, the desired Class can be described by listing capabilities, and a matchmaker (the Policy, using current terms) can match the claim with an actual class.

bassam · 2019-09-07T04:30:52Z

I've been thinking of the claim to class matching as a "scheduling" problem. Specifically you might need to match multiple claims and workloads all together to a specific set of classes.

Also seeing some of the slack discussion I wanted to clarify one thing in my proposal. I’m not suggesting that we re-add ResourceClass or even more generally a Class. Instead, I'm suggested that we continue with the strongly typed approach and create a portable class definition for each claim. So, add MySqlClass for the MySqlClaim. This is a consistent portable experience — every claim type point to exactly one class type. It might also enable us to go back to simple classNames instead of complex ObjectReferences classRef

upbound-bot · 2019-09-09T03:21:56Z

negz · 2019-09-09T07:25:12Z

"match" or "resolve" could be a clearer term than "reference".

@suskin The use of the term reference is due to Kubernetes API conventions. One object (the claim) specifies another (the class) via a reference. API conventions state that references should be named fooRef or fooName.

The idea of a claim "matching" (rather than referencing) a class would imply using a label selector, which typically implies the matcher can match with one or more matchees. This may be appropriate for the claim -> class relationship, but we'd need to determine what happens when a claim matched more than one class.

Perhaps it is worth diverging from Kubernetes conventions and terminology in this case, but as software within the Kubernetes ecosystem we should make that predictability tradeoff carefully.

negz · 2019-09-09T08:18:26Z

I've thought about this design over the weekend, and broadly agree with @bassam's proposal. I wrote up a use case from a perspective I understand well to help work through this:

As an infrastructure administrator I support various teams of app operators. I want to empower these app operators to make requests for infrastructure in a fashion that is portable across clouds. App operators request infrastructure using resource claims. The portable nature of these resource claims means they cannot be high fidelity representations of the external resources they represent; they suffer the “lowest common denominator” problem.

Resource classes are the solution to the lowest common denominator problem. They allow me to provide a set of configuration parameters, or managed resource "shapes", that app operators authoring resource claims may choose from. I’d like most app operators to choose a class like “mysql-large” or “kubernetes-prod” without concerning themselves with which managed resource kind or even which cloud will be used to satisfy their claim.

I expect my some of my app operators will have use cases that I have not considered. For example, I may not have predicted that one of my app operators has a need for very small SSD backed PostgreSQL databases. Because app operators can’t use portable resource claims to configure parameters like storage type and disk size, and because I don’t want my infrastructure team to block the creation of new resources, it’s important that my app operators be allowed to author and select their own resource classes.

While I want to enable my app operators to author their own resource classes, I also want to provide some constraints as to what they can choose. I’m concerned about VM cost so I don’t want to allow any Kubernetes cluster to have more than 20 nodes. My organisation gets deep discounts on Azure, so I don’t want to allow any resources to use GCP or AWS without prior discussion.

I like the terms "portable" and "non-portable" for user facing concerns, but I'll use the terms "resource claim aligned" (e.g. MySQLInstanceClass) and "managed resource aligned" (e.g. RDSInstanceClass) in this post because I think they're a little more explicit in this context.

I believe we should today:

Introduce resource claim aligned classes such as MySQLInstanceClass as a simple "pointer" type to managed resource aligned classes such as RDSInstanceClass.
Limit resource claims to referencing resource claim aligned classes. That is, a MySQLInstance classRef omits type metadata and inherently references a MySQLInstanceClass. A MySQLInstance author who wishes to reference an RDSInstance must reference (and possibly author) a MySQLInstanceClass that in turn points to their desired RDSInstanceClass. This limitation simplifies both our user's mental model and our codebase logic at the expense of a small amount of extra YAML.
Use a simple label, i.e. default="true" to specify the default resource class for any claim that omits a class reference. These default classes would always be resource claim aligned. Both classes and claims are namespaced, so this implies resource classes would be configured in the "app" namespace, not the "infrastructure" namespace where managed resources are created.

The idea behind the contemporary resource claim aligned "policy" kinds was that they might, in addition to specifying a default class, be used to constrain the parameters a resource claim could specify (via a class) in future. For example a MySQLInstancePolicy might have said "use the cheap RDSInstanceClass by default, and don't let anyone select a CloudSQLInstanceClass over 100GB". Conceptually I see "policy" as something imposed on app operators (e.g. resource claim authors) by infrastructure operators to constrain what they can do, so on reflection I don't think it makes sense for an app operator to be able to choose which policy applies to their resource claims. I think we should consider introducing managed resource aligned "policy" kinds in future in order to constrain the kinds and shapes of managed resources app operators can deploy. These policies would be specified in the namespace of the managed resources they apply to, not the namespace where claims are made.

Putting this all together looks something like the following YAML.

This first block pertains exclusively to infrastructure. It defines a namespace that will logically group production infrastructure managed resources, and some policies that restrict the shape of those resources. In organisations with a distinct team of infrastructure operators this block will almost certainly be authored by infrastructure operators:

---
# A namespace that will logically group "production" managed resources.
apiVersion: v1
kind: Namespace
metadata:
  name: production
---
# A GCP Provider with service account secret reference
apiVersion: gcp.crossplane.io/v1alpha1
kind: Provider
metadata:
  namespace: production
  name: coolorg
spec:
  credentialsSecretRef:
    name: coolorg-gcp
    key: credentials.json
  projectID: coolorg-prod
---
# A policy that limits CloudSQL instances to 100GB of storage in production.
apiVersion: v1alpha1
kind: CloudSQLInstancePolicy
metadata:
  namespace: production
  name: maximum-storage-size
constraints:
  storageSizeGB:
    max: 100
---
# A policy that prevents the creation of RDS instances in production.
apiVersion: v1alpha1
kind: RDSInstancePolicy
metadata:
  namespace: production
  name: disallow-rds
enabled: false

This second block pertains to an application. It creates a namespace to logically group the application's Kubernetes resources, and a claim and classes used to provision infrastructure. Whether the classes are authored by the app operators, the infrastructure operators, or both, depend on how liberal the organisation is with regard to allowing app operators to choose their own configuration.

---
# A namespace that will logically group the "coolapp" application.
apiVersion: v1
kind: Namespace
metadata:
  name: coolapp
---
apiVersion: database.gcp.crossplane.io/v1alpha1
kind: CloudSQLInstanceClass
metadata:
  namespace: coolapp
  name: coolapp-production
providerRef:
  # This reference, not the classRef, determines which namespace the managed
  # resource will be created in.
  namespace: production
  name: coolorg
specTemplate:
  storageSizeGB: 80
  # Pretend all the other required fields are set here...
---
apiVersion: database.crossplane.io/v1alpha1
kind: MySQLInstanceClass
metadata:
  namespace: coolapp
  name: coolapp-production
  labels:
    # This class is the default for any MySQLInstance that omits a classRef
    default: "true"
underlyingClassRef:
  # This reference is implicitly to the named kind in the same namespace.
  apiVersion: database.gcp.crossplane.io/v1alpha1
  kind: CloudSQLInstanceClass
  name: coolapp-production
---
apiVersion: database.crossplane.io/v1alpha1
kind: MySQLInstance
metadata:
  namespace: coolapp
  name: coolapp-production-a
spec:
  # This reference is implicitly to a MySQLInstanceClass in the same namespace.
  classRef:
    name: coolapp-production
  engineVersion: MYSQL_5_7

jbw976 · 2019-09-09T13:19:21Z

Solid write-up @negz, thanks for adding the detailed use case as well to keep this grounded.

I assume that the policy and constraints aspects of this design is not for v0.3. Is the key part of this design that is in scope for v0.3 simply the MySQLInstance -> MySQLInstanceClass -> Strongly typed class, as well as supporting a simple default: "true" label on portable classes (e.g. MySQLInstanceClass)?

hasheddan · 2019-09-09T13:43:09Z

I believe the document is currently aligned with @negz write-up. To address one comment:

Limit resource claims to referencing resource claim aligned classes.

This kind of restriction is implemented at the claim reconciler logic level. In the current proposal in this PR, I have proposed a flexible model with the shared claim reconciler that allows for claim controllers to either limit claims to a portable class instance (i.e. "claim-aligned class") or allow for references to strongly typed class instances (i.e. "managed-aligned class") directly. Referencing strongly-typed classes directly would not be employed in any stack at the moment, but it leaves the door open for strongly-typed claims in the future. Any thoughts on this model @jbw976 @negz ?

negz · 2019-09-09T19:30:54Z

I assume that the policy and constraints aspects of this design is not for v0.3.

Correct. I'm definitely not suggesting we implement policy and constraint support for 0.3.

Is the key part of this design that is in scope for v0.3 simply the MySQLInstance -> MySQLInstanceClass -> Strongly typed class, as well as supporting a simple default: "true" label on portable classes (e.g. MySQLInstanceClass)?

It is. I wrote my comment without reading the latest update of the one pager, but I see it sounds like I'm in agreement with what is currently proposed.

I think there are two ideas to track that do not need to be in scope for 0.3:

Creating managed resources in the namespace of the provider, not the namespace of the non-portable, managed resource aligned, class.
Introducing policy or constraint support.

I'll raise separate issues to track these.

negz · 2019-09-09T19:32:07Z

Any thoughts on this model @jbw976 @negz ?

@hasheddan I would strongly prefer to simplify the model such that a claim can only reference a portable class. I'll add comments to the doc to this effect.

negz · 2019-09-09T19:34:52Z

design/one-pager-default-resource-class.md

+* 2.0
+  * Updating document to reflect the current method of defaulting using
+    `Policies`
+  * Introducing `Policies` as a class of service definition


Should this now be portable classes?

negz · 2019-09-09T19:35:27Z

design/one-pager-default-resource-class.md

- Allow operators and administrators the opportunity to provide a well defined, sane default class of commonly used resources that developers commonly submit claims for within a team or organization
- Minimize the burden of determining acceptable resource claims to submit for approval to an operations team. The ability to fall back on the underlying resource that has been deemed acceptable reduces unnecessary workflow stoppages and side channel communication
- Provide an *optional* feature that does not necessarily have to be implemented within a team or organization
+* Allow for claim portability by using policies, much how the generic


Another reference to policies; should this be updated?

negz · 2019-09-09T19:47:26Z

design/one-pager-default-resource-class.md

+to define a "class of service" for a given claim `kind` by referencing a
+strongly-typed resource class instance. However, in the future, these portable
+classes will likely be expanded to define ranges and constraints for portable
+claim kinds that reference them.


After some reflection I personally don't think portable classes (or non-portable classes) are the right place to define ranges and constraints, so I suggest we think of portable classes as simply a pointer to a non-portable class.

In organisations with distinct infrastructure operator and application operators I expect infrastructure operators would be placing constraints on the parameters application operators could choose. I think this should be done separately from resource classes because I think it should be possible for an application operator to author their own resource classes, but still be subject to the constraints configured by the infrastructure operator. I discuss this in a little more detail in #743 (comment).

I would tend to agree. However, if others have differing thoughts, I don't think the work designated for v0.3 has to make this decision.

negz · 2019-09-09T19:48:39Z

design/one-pager-default-resource-class.md

+In order to reintroduce the ability to select a class of service in claim, the
+following steps can be taken:
+
+1. Change all `Policy` types to `Class` types and include a `classRef` field


Potentially bikeshedding, but I wonder whether this should be underlyingClassRef, nonPortableClassRef, or something similar, to clarify that it must point to a non portable class rather than another portable class.

negz · 2019-09-09T19:54:38Z

design/one-pager-default-resource-class.md

+   reconciler](https://github.com/crossplaneio/crossplane-runtime/blob/e4d61ee2805af680baf16fc2a1d8f79538d0f9bb/pkg/resource/claim_reconciler.go#L281)
+   in `crossplane-runtime`. It should require minimal updates to the actual
+   claim controllers in each of the provider stacks in order to indicate the
+   portable class `kind` that they should use (example below).


I suggest that we avoid designing for the non-portable (aka strongly typed) resource claim scenario for now; it may make sense to write a separate generic resource claim reconciler to handle non-portable resource claims instead of introducing complexity to the existing reconciler.

I believe it would make both our code and our user's mental model of how Crossplane resources plumb together simpler if we required that portable claims reference only portable classes, even in cases where a user explicitly wanted to reference a non-portable class.

negz · 2019-09-09T20:02:27Z

design/one-pager-default-resource-class.md

+    resource.ClaimKind(databasev1alpha1.MySQLInstanceGroupVersionKind),
+    resource.PortableClassKind(databasev1alpha1.MySQLInstanceClassGroupVersionKind),
+    resource.ClassKind(v1alpha1.RDSInstanceClassGroupVersionKind),
+    resource.ManagedKind(v1alpha1.RDSInstanceGroupVersionKind),


I think we could introduce a container type to keep the signature of NewClaimReconciler a little more manageable. For example:

type ClassKinds struct { Portable ClassKind NonPortable ClassKind }

As a side note, writing out "nonportable" feels kind of gross in code. I wish we had a better antonym for portable.

I like this container pattern. How would you feel about Strong? I don't love it but it is succinct and descriptive.

negz · 2019-09-09T21:05:25Z

design/one-pager-default-resource-class.md

-  ...
+  policyRef:
+    name: standard-mysql
+    namespace: crossplane-system


I think this thread might be out of date now that we're planning to use portable classes rather than policies.

Given defaulting behaviour will be namespace scoped (i.e. we'll look for a portable class with label default="true" in the namespace of the claim) I suggest we consider requiring that portable classes live in the same namespace as the claims that reference them; i.e. we use a corev1.LocalObjectReference from claim to (portable) class.

As long as we use a corev1.ObjectReference from the portable class to its underlying non-portable class this still gives us flexibility on which namespace the non-portable class (and thus managed resource) are created in.

Consider the following flow:

Infrastructure operator creates infra namespace (e.g. prod).

Infrastructure operator creates a provider and some non-portable classes (e.g. RDSInstanceClass) in said namespace.

Infrastructure operator "publishes" the aforementioned non-portable classes to the app operator's namespace by creating portable some classes (e.g. MySQLInstanceClass) that reference the non-portable classes created in step 2, and labelling one of each kind as the default for said app namespace.

App operator lists the portable classes that are available in their app namespace, picks one, and creates a resource claim that references it.

upbound-bot · 2019-09-10T05:27:01Z

upbound-bot · 2019-09-10T05:34:29Z

design/one-pager-default-resource-class.md

prasek

Looks good, thanks for the quick turn on this!

prasek · 2019-09-10T16:13:25Z

design/one-pager-default-resource-class.md

- Provide an *optional* feature that does not necessarily have to be implemented within a team or organization
+* Allow for claim portability by using portable classes, much how the generic
+  `ResourceClass` operated
+* Enable claims with no reference 


... with no class reference

prasek · 2019-09-10T16:19:33Z

design/one-pager-default-resource-class.md

 ```

-And a resource claim that defaulted to the resource class above could look like this in the general case:
+A claim referencing a portable class will now look as follows:


Really like this, much simpler for the consumer.

prasek · 2019-09-10T16:22:32Z

design/one-pager-default-resource-class.md

 ```

-Or in the more specific case:
+With an arbitrary number of potential portable `MySQLInstanceClass` objects that


This line makes it seem like the claim above could map to an arbitrary number of potential portable class object, when it's classRef.name is standard-mysql so it will only map to a single portable class object in a given environment.

Suggest rewording to something like this: "A portable class can be referenced by name, and the portable class definition can be customized to use the cloud services a given application or team wants to use, e.g. RDS in one environment and CloudSQL in another."

Updated in most recent commit. I didn't use the exact wording but I think I communicated the same sentiment.

prasek · 2019-09-10T16:25:32Z

design/one-pager-default-resource-class.md

+  namespace: crossplane-system
+```
+
+### Denote Default via Label


Really like the use of this simple label to mark the default.

upbound-bot · 2019-09-10T16:36:57Z

bassam

Looking great. Thanks for the collaborative effort on this one. Small question remaining for me above.

bassam · 2019-09-10T16:46:52Z

design/one-pager-default-resource-class.md

 spec:
-  ...
+  classRef:
+    name: standard-mysql


can this become className: instead of classRef: name:

Hmm, this may be tricky. It would involve changing the embedded ResourceClaimSpec in crossplane-runtime which embeds corev1.ObjectReference. We are going to be changing this to corev1.LocalObjectReference, which only has a name field. If we were to make this inline, the field would just be called name, not className. I am sure we could do some trickery to have a string field called className then create a LocalObjectReference from it, but the Spec would be less familiar to Kubernetes users and we would not be following common practice.

I don't think it should block the design being merged, but wold be great to avoid naming the group/version given the 1:1 mapping between claim and class in this design.

Oh yes, it will only accept name as it will now be LocalObjectReference

Coming in late here, but either className or classRef using a LocalObjectReference (with only a name field) seem to be blessed by the Kubernetes API conventions. I think LocalObjectReference is a little more consistent with the rest of our API (we don't currently have any fooName fields), but we would benefit from a general API overhaul in this area; we have plenty of LocalObjectReference types that could be simple fooName fields, and I believe we may have a few ObjectReference fields that could be LocalObjectReference.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>

upbound-bot · 2019-09-10T17:45:33Z

hasheddan requested review from jbw976, negz, prasek and suskin September 4, 2019 05:17

hasheddan force-pushed the multi-policy-design branch from 787f2dd to d7c1e4f Compare September 4, 2019 05:18

suskin reviewed Sep 5, 2019

View reviewed changes

negz reviewed Sep 5, 2019

View reviewed changes

hasheddan requested review from negz and suskin September 5, 2019 13:51

negz approved these changes Sep 6, 2019

View reviewed changes

jbw976 reviewed Sep 6, 2019

View reviewed changes

hasheddan changed the title ~~Claim portability using Policies~~ Claim portability using portable classes Sep 9, 2019

negz reviewed Sep 9, 2019

View reviewed changes

hasheddan force-pushed the multi-policy-design branch from 025f74d to c54a600 Compare September 10, 2019 05:21

prasek reviewed Sep 10, 2019

View reviewed changes

design/one-pager-default-resource-class.md Outdated Show resolved Hide resolved

prasek approved these changes Sep 10, 2019

View reviewed changes

bassam requested changes Sep 10, 2019

View reviewed changes

portable classes design

9013b9a

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>

hasheddan force-pushed the multi-policy-design branch from d03b084 to 9013b9a Compare September 10, 2019 17:38

hasheddan requested review from bassam and jbw976 September 10, 2019 17:39

bassam approved these changes Sep 10, 2019

View reviewed changes

jbw976 approved these changes Sep 10, 2019

View reviewed changes

jbw976 merged commit c12d116 into crossplane:master Sep 10, 2019

hasheddan mentioned this pull request Sep 11, 2019

Implement portable classes crossplane/crossplane-runtime#13

Merged

4 tasks


		Currently, each Crossplane resource kind (i.e. GKE Cluster, AWS S3 Bucket, etc.) has a controller that reconciles claims for that resource by binding them to the corresponding managed type. These controllers use [predicates](https://github.com/crossplaneio/crossplane/blob/master/pkg/resource/predicates.go) to ensure that there is a provisioner defined for the class referenced by the claim. If the claim contains no reference to a class, the controller will not act on the claim.
		They would then create a `MySQLInstancePolicy` that referenced this class:

	developers may deploy as part of their application. Currently, for a developer
	developers may deploy as part of their application. Originally, for a developer

	etc.). Claims controllers could no longer simply omit a `classRef` because their
	etc.). Claims could no longer simply omit a `classRef` because their

Claim portability using portable classes #743

Claim portability using portable classes #743

Conversation

hasheddan commented Sep 4, 2019

Checklist

upbound-bot commented Sep 4, 2019

upbound-bot commented Sep 4, 2019

suskin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

negz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

upbound-bot commented Sep 5, 2019

negz left a comment

Choose a reason for hiding this comment

prasek commented Sep 6, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jbw976 commented Sep 6, 2019

hasheddan commented Sep 6, 2019

prasek commented Sep 6, 2019

bassam commented Sep 6, 2019 • edited Loading

suskin commented Sep 6, 2019

bassam commented Sep 7, 2019

upbound-bot commented Sep 9, 2019

negz commented Sep 9, 2019

negz commented Sep 9, 2019 • edited Loading

jbw976 commented Sep 9, 2019

hasheddan commented Sep 9, 2019

negz commented Sep 9, 2019

negz commented Sep 9, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

upbound-bot commented Sep 10, 2019

upbound-bot commented Sep 10, 2019

prasek left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

prasek commented Sep 6, 2019 •

edited

Loading

bassam commented Sep 6, 2019 •

edited

Loading

negz commented Sep 9, 2019 •

edited

Loading