-
Couldn't load subscription status.
- Fork 1.8k
Proposal Doc: Composable integration in Operator-SDK #2340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Using the Composable SDK means that the CRD needs to have permission to access objects that are being referenced. | ||
| This means that the operator developer needs to consider what kinds of objects will be permitted for references and | ||
| add appropriate permissions for the RBAC rules of the CRD. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @vazirim,
Really thank you for share your ideas and contribute with, first at all 🥇.
I have a few questions/concerns which follow. Please let me know if I misunderstood something.
So, with this solution, in a CR of a CRD for example ( Memcached ), we can map many references for pre-exisent resources in the cluster and then, it will be used to find the data by the ResolveObject. Following code from the tutorial just to clarify my questions :-)
// Resolve the memcached instance
resolved := &cachev1alpha1.MemcachedResolved{}
comperr := sdk.ResolveObject(r.client, r.config, memcached, resolved, request.Namespace)
if comperr != nil {
if comperr.ShouldBeReturned {
return reconcile.Result{}, comperr.Error
}
reqLogger.Info("Error during resolve, but not returned", comperr.Error)
return reconcile.Result{}, nil
}- How it will deal with error scenarios, for example, if 3 of 10 resources mapped for Memcached cannot be found? Will the Memcached resource be created or not? Has it a way to tell that some ref is optional?
- If the object, for example, a ConfigMap be found but it has not the key mapped in the CR how will it work?
- How would be the performance of it if the user adds too many resources refs to be "resolved"?
- How to validate the data in the CR? Has it already something implemented for that?
- Also, I think that would not be possible to call forever the reconcile in the case of something nothing works.
- And, in real scenarios, I am afraid that would not possible to solve it by changing the options in the manager to re-trigger the reconcile every few seconds as in the demo. It would need to have a watch/observer impl for each resource referenced.
WDYT folks @estroz @joelanford @shawn-hurley ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add one more, which is should there be some way to specify a default if it can not be resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @camilamacedo86,
Thanks for your feedback!
-
If 3 out of 10 resources cannot be found, then ResolveObject returns an error. It either resolves it all, or nothing. It also employs caching to get a consistent view. The cache is made fresh for each invocation of ResolveObject.
-
If the object is found, but the data in it doesn't exist, then ResolveObject also returns an error.
-
The performance hasn't been an issue so far because we have only seen CRs with a handful of references and no more. Resolving those references amounts to looking up in etcd. In any case, caching also alleviates performance.
-
We have a validation that happens inside the Composable operator controller (so that the underlying template is well-formed). For the Composable-SDK, since the developer is modifying the CRD definition, and the open-api spec of the CRD, we get validation through that.
-
The decision to retry is up to the consumer of the composable-SDK. But any operator will have a similar problem -- you need to decide for various error scenarios whether it makes sense to requeue another Reconcile cycle or not. So this is not specific to composable-SDK.
-
Sure, having watchers works as well. In the operators we have developed, we like to resync periodically to check for the health of what the operator is managing.
--Mandana
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shawn-hurley, yes we had thought of having defaults in the context of the Composable operator because it helped with validation (we could do a dry-run deployment). But we opted for not having defaults. The idea being that we want to have the ability of resolving data-dependencies dynamically. So if a resources has not been created yet, we retry until it's online. We have found this use case to be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shawn-hurley The latest version of the Composable SDK (v.0.1.4) has support for default values.
| The Composable SDK offers the following function for resolving the value of cross-resource references. | ||
|
|
||
| ```golang | ||
| func ResolveObject(r client.Client, config *rest.Config, object interface{}, resolved interface{}, composableNamespace string) *ComposableError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does ResolveObject() use the rest.Config parameter for? I'm curious if we can get by passing in just the client.Client?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config is used to get a CachedDiscoveryInterface (k8s.io/client-go/discovery), which is used to lookup a resource kind given a kind and an apiversion. This in turn gives information that is used to lookup the reference object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, it means that it will not respect the operator scope. For example, if we pass just an ns for the manager here it would be looking for in all ns of the cluster. Am I right?
If yes, IHMO it is something that needs to be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the object reference have no namespace specified, then it will look only in the namespace of the manager. This gets passed as the composableNamespace parameter to ResolveObject(). That's the default namespace to use.
If the object reference has a namespace specified, it will look there instead.
The Config is used to get a CachedDiscoveryInterface and all we need to know is to see if the kind of the object reference is namespaced or not. This will inform how to create a NamespacedName for object lookup. If it is namespaced, then the convention above is used to determine the namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @vazirim,
Tks for the clarifications.
IHMO in order to the Compasable work aligned with it would need to use the manager/cache which is provided instead of a new cache. Otherwise, shows that It would allow for example, besides the operator be set up in the manager/cache with a List of namespaces looking for the resource out of this scope. Also, it can have more options setup by default which IHMO should be respected in all project. Is it make sense?
PS.: The manager can be set up with some options. Regards the ns it can be "" which will allow cache all cluster, or a list of namespaces or just 1 namespace. See: https://godoc.org/sigs.k8s.io/controller-runtime/pkg/manager#Options
Please, let me know if I misunderstood something as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ResolveObject is indeed using the client provided by the controller:
func ResolveObject(r client.Client, config *rest.Config, object interface{}, resolved interface{}, composableNamespace string) *ComposableError {
The CachedDiscoveryInterface is obtained from the controller's client as well (the client passed into the ResolveObject)...
I am not sure I understand the concern (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation @vazirim! I see now why the rest.Config is important. But now I have a follow-up. :)
Have you considered making a ResolvesObject interface:
For example,
type ResolvesObject interface {
// ResolveObject resolves object references. It uses a context
// for cancellation.
ResolveObject(ctx context.Context, in, out interface{}) error
}And a struct that implements it:
type KubernetesResourceResolver struct {
defaultNamespace string
client client.Client
resourcesClient discovery.ServerResourcesInterface
}With this, the resolver could be setup and included as a field in the reconciler implementation, it would simplify the user-facing API, and it would make it more extensible for other possible future use cases (and writing unit tests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joelanford The latest version of the Composable SDK (v0.1.4) now has an interface like you have suggested. In addition to the ResolveObject interface, we have the struct that implements it:
type KubernetesResourceResolver struct {
Client client.Client
ResourcesClient discovery.ServerResourcesInterface
}
We simplified and got rid of the default namespace here.
The resolveObject now has this interface:
func (k KubernetesResourceResolver) ResolveObject(ctx context.Context, object interface{}, resolved interface{}) error
The assumption is that the object passed has a namespace, and that serves as the default namespace. We also got rid of ComposableError and added functions to determine the nature of the error (see below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumption is that the object passed has a namespace
Forgive me if this has already been answered or addressed. Does this mean that cluster-scoped resources are not supported? Or can I pass an object where namespace is empty and that means cluster-scoped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cluster-scoped resources are supported as well. We check to see if the reference is to a cluster-scoped resource or not. If not, and if the namespace is omitted from the GetValueFrom reference, then we use the default namespace, which is the namespace of the object itself (that contains the reference).
|
|
||
| The `ResolveObject` function uses caching for looking up object kinds, as well as for the objects themselves, in order | ||
| to ensure that a consistent view of the data is obtained. If any data is not available at the time of the lookup, | ||
| it returns an error. So this function either resolves the entire object or it doesn't -- there are no partial results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's at least one interesting scenario that operator developer's using these object references would need to consider, and that is that the object resolves successfully and is created, and then at some point later, one of the referenced objects in a GetValueFrom field is deleted.
In that case, it looks like ResolveObject() will fail, but the caller may want to have some custom logic to handle the specific reason for the failure.
For example, if the error was a temporary connection issue, a caller would want to retry. But if the error is that a referenced object is not found, I probably still want to start retrying, but I might also want to delete the other objects that I created during a previous reconciliation that relied the resolved object that is now no longer resolving. Does that make sense?
Does the returned ComposableError.Error field have enough detail to allow this sort of decision making in the calling code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently ComposableError does not have this level of detail.
| } | ||
| ``` | ||
|
|
||
| The type `ComposableError` contains the error, if any, and a boolean `ShouldBeReturned` to indicate whether the calling reconciler should return this error or not. This is used to distinguish errors that are minor and could be fixed by immediately retrying from errors that may indicate a stronger failure, such as an ill-formed yaml (in which case there is no need to retry right away). If there are no errors, then the `ResolveObject` returns nil. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given my previous comment, it seems like there could be a gray area for deciding how ShouldBeReturned should be set. I wonder if it would make sense to expose a separate helper function in the composable SDK to handle this such that ResolveObject() could return the error directly.
For example:
func IsUnresolvableError(err error) bool { ... }err := composable.ResolveObject( ... )
if composable.IsUnresolvableError(err) {
return reconcile.Result{}, err
} else if apierrors.IsNotFound(err) {
// do some extra logic
} else {
return reconcile.Result{}, err
}This way callers could use composable's helper function OR define their own custom error handling, and it would get error handling logic out of ResolveObject.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's a good idea to separate this concern, and add more detail to the kinds of error situations that can happen. Will open a work item.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joelanford The latest version of Composable SDK (v0.1.4) got rid of ComposableError. We now have a series of functions as you suggested:
func IsIllFormedRef(err error) bool
func IsKindNotFound(err error) bool
func IsObjectNotFound(err error) bool
func IsValueNotFound(err error) bool
func IsRefNotFound(err error) bool
IsIllFormedRef means there is a problem with the way the cross-resource reference has been written in the yaml. IsKindNotFound means the kind for the reference does not exist. IsObjectNotFound means the object does not exist, and IsValueNotFound means the value does not exist within the object. The last one simply means either IsKindNotFound, or IsObjectNotFound, or IsValueNotFound is true.
| Rather than hardwiring the same value everywhere, I can define cross-resource references to a unique configmap that contains | ||
| the data. This makes it easier to change the data if needed, and can allow all resources to be configured | ||
| dynamically (i.e. changes in the configmap are absorbed and resource that point to it are updated). | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same need could be solved by just GET object in the controller and then, updating the CR resource with the values. Not necessary is required to add all values used in the CR. It can be obtained programmatically, dynamically and in realtime as well.
IHMO the user's stories needs to be improved because the idea here would not be allowed something that is not possible today. It is just to add a facility, I mean another way to get RefFrom without need to implement it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that rather than hardwiring that in the controller, each CR of the CRD could have fields that point to different objects. So this gives a lot of flexibility.
|
|
||
| To add support for cross-resource references in CRD definitions and controllers (Composable SDK) and make this | ||
| functionality available through the Operator-SDK. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is missing: How it would be added in the SDK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would add the types that are here:
https://github.com/IBM/composable/blob/master/sdk/composable-types.go
so that operator developers can use the ObjectRef to specify that fields have this type.
We also add the method ResolveObject (possibly with a modified interface so that the receiver is the client instead of the first argument) as one more way in which the developer can access objects in etcd (like Get, Update etc...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think would be nice to have your thoughts on how it should be added to the project too in the doc. Really tks for all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok will add that.
Thanks for all the feedback!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@camilamacedo86 I changed the proposal doc according to the changes above for the interface.
| ## Summary | ||
|
|
||
| Kubernetes object specifications often require constant values for their fields. When deploying an entire application | ||
| with many different resources, this limitation often results in the need for staged deployments, because some resources have to be deployed first in order to determine what data to provide for the specifications of dependent resources. This undermines the declarative nature of Kubernetes object specification and requires workflows, manual step-by-step instructions and/or brittle automated scripts for the deployment of applications as a whole. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does it mean for the referencing object and its lifecycle when the referenced object does not exist yet? Will it be created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmesser If the referenced object does not exist, then the referencing object's controller will keep retrying until that object is created. When that object is created, then the controller can resolve the referencing object and continue with whatever it needs to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! How long would it keep trying?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That depends on the consumer of the library. In the memcached tutorial, this is done indefinitely.
We can imagine having a message in the Status of the object saying that it is waiting for some object to be created. If there is a bug in the spec of the object reference, the user could catch that and fix it. In any case, when the error is fixed or the object is created, then the controller can resolve the referencing object and move forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. @joelanford I think if we were to take this in, it would probably be a good default behavior to write out a status when waiting on referenced values.
Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>
Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>
|
I wonder if there are any plans for merging this proposal? |
|
@vazirim some thoughts on this contribution: The SDK should definitely make composable visible to operator authors, as it solves advanced use cases. However I'm concerned about the maturity of the composable operator and the community commitment required to support a potentially immature project in the SDK itself. My suggestion is to add the helpers proposed here to an operator-framework library and/or discuss composable usage in documentation. To further discuss how to proceed I'd like to nail down the specifics of contributing, with graduation criteria, in #2770. For now this PR should stay open while we figure out a contribution methodology. |
|
@estroz would it help if I did a code PR against operator-sdk? |
|
@vazirim not quite. To put what I said above differently: until we figure out a general framework for contributing code from external projects to operator-framework ones, neither this proposal nor its code should be merged. |
|
@vazirim Apologies for being MIA on this proposal and leaving it hanging so far! And thanks for popping it to the top of the stack. We haven't had a community proposal of this scale before so you're unfortunately the guinea pigs and we were unfortunately caught flat footed and unprepared when all of the questions were answered and the next step was to merge. 😳 🤣 In the coming weeks we will be putting together docs and guidelines for accepting community-contributed features of this scale and establish some governance and procedures. |
|
@joelanford, @estroz Thanks for the update! |
|
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
|
Hi @joelanford, this PR is going stale! |
|
/remove-lifecycle stale |
|
@vazirim I opened operator-framework/community#22. I should have linked that in this PR already! It adds some governance around projects in the operator-framework org. Our current thinking is to bring this (and other similar features) in as "incubating" projects with a separate repository to let them mature and build a community within operator framework before "graduating". I think we need to get the project governance agreed upon and merged, and then we can proceed with this as the first candidate for incubation. Sorry this is taking so long! |
|
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
|
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
|
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /close |
|
@openshift-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Description of the change:
Proposal doc to integrate Composable SDK into Operator-SDK.
Motivation for the change:
Composable SDK allows operator developers to add cross-resource references to their CRDs,
enabling dynamic configuration.