Mirror PersistentVolume ReclaimPolicy semantics #87
With this change, the behavior of ReclaimPolicy is extended to cover the relation between claim and managed, too. IMO, this is a public interface change, which would require a version bump on CRDs. However, we know that this is not the final destination we want to arrive at, per #21 and #27. At some point, we want to separate the external resource policy from the reclaim policy, which would then again be a breaking change. So, I think we should just implement the external policy as well and then do the bump only once, in this release.
Side note: for v1alpha* resources this may not be as important, but for v1beta1 resources I do think that a behavior change requires a bump to v1beta2.
// Unbind the supplied Claim from the supplied Managed resource by removing the
// managed resource's claim reference, transitioning it to binding phase
// "Released", and if the managed resource's reclaim policy is "Delete",
// deleting it.
func (a *APIStatusBinder) Unbind(ctx context.Context, _ Claim, mg Managed) error {
I don't think we should call this Unbind anymore. Release maybe?
Release might be a slightly better name, but I think Unbind is still decent, so I would prefer to avoid the API churn of renaming the method.
Yeah I see that, but still I'd expect the resource to be in Unbound state when I call Unbind.
The Kubernetes API versioning guidance is not super explicit about this. For beta resources it states:
As opposed to alpha APIs, which state:
If a subsequent beta or stable release is interpreted to refer to the custom resource and not the software and we deem this semantic/behaviour change to be incompatible with the previous behaviour, then I agree that this warrants a version bump for beta resources.
I don't think this is true anymore. I'm much less confident these days that the Presuming we still want to mirror the persistent volume semantics, then we don't need to (or indeed want to) introduce a second setting that determines whether external resources are deleted along with their managed resources; that is covered by
I'm quite strongly convinced these days that Crossplane should always be the source of truth for the managed resources it manages. For the above two potential configuration settings this means:
In summary, I feel we should move forward with this PR and don't expect to change how |
I think this is indeed the case. Quoting from the versioning guideline:
If it was a new field, I'd say we don't need the bump. But it's clear to me that the user gets a different behavior now when they choose I feel like this is more of a product direction discussion rather than technical. I don't want to block this PR. Let's discuss it under #22 |
I am good with this change as an incremental step and feel as though it is in-line with the design proposed. I do think working on importing resources / using existing resources should be prioritized, but not tightly coupled to this change.
Note: this will also require a slight change to core Crossplane as the |
I am not sure about the fact that when reclaim policy is set to Retain, even though the resource is retained after claim deletion, it cannot be bound to another claim.
Discussed with @negz offline and now I think deferring this discussion to another time sounds like the best path forward.
In short: I believe managed resources should be re-schedulable in some way. Things like volumes or databases can be considered single-application resources, where it doesn't make a lot of sense for another application to take over, but there are stateless cloud services and various scenarios where the user may want to re-use a resource after it's been released by a claim. In the future, we can introduce a ReclaimPolicy option to cover this, and doing so when users ask for it makes more sense.
The resource import story will be covered in #22.
A small clarification - we'd delete the managed resource CR by default, but not no matter what. Whether or not the managed resource was deleted was determined by the cascading deletion policy, so
We discussed this on a call today, but I want to capture my (somewhat rambling) thoughts here too. I believe the following to be true with or without the change proposed by this PR:
What changes with this PR is whether a managed resource can be claimed (which is distinct from used) multiple times serially, which brings us to philosophising about what a claim is.
I see managed resources (and classes) as being "offerings" published by an infrastructure operator, and resource claims as being just that: "claims" to one of those offerings by a particular entity. So an entity can claim a managed resource in a particular namespace, and that claim can be used by one or more entities within its namespace. At some point the entity who claimed the managed resource will not want it any more. I feel that a claim being deleted means "One or more entities were using this resource for a particular purpose or set of purposes. We no longer need this resource for that purpose or set of purposes."
The question then is whether it's safe and sensible to offer the same underlying resource to another claimant, potentially in another namespace. I don't think it is a safe and sensible default to then allow another entity to claim that managed resource for a different set of purposes, given that by doing so they'll potentially gain access to any state that was left over by the previous claimant.
Put otherwise: "a big database instance" is a managed resource, while "the billing database instance" is a resource claim. By sharing or reusing at the resource claim level you're allowing multiple entities to use "the billing database". By reusing at the managed resource level you're allowing multiple entities to use "a big database instance", without consideration of the fact that the big database used to be the billing database instance.
Keep in mind the reclaim policy is set by the infrastructure operator, not the claimant, so the infrastructure operator is deciding what happens to it when it's done. The cases we support here are:
Persistent volumes support a deprecated |
This commit changes the meaning of the resource claim resource policy to match https://kubernetes.io/docs/concepts/storage/persistent-volumes/#reclaiming as closely as possible, minus the deprecated 'Recycle' policy. Previously the reclaim policy dictated only what happened to the external resource when its managed resource was deleted. Signed-off-by: Nic Cope <negz@rk0n.org>
I agree with this statement. It definitely shouldn't be the default. But I also see the value in making this possible. Opened #88 to track this discussion.
Description of your changes
Fixes #21
This commit changes the meaning of the managed resource reclaim policy to match https://kubernetes.io/docs/concepts/storage/persistent-volumes/#reclaiming as closely as possible, minus the deprecated 'Recycle' policy. Previously the reclaim policy dictated only what happened to the external resource when its managed resource was deleted.
Note that the implementation here differs slightly from some of our discussions around the reclaim policy, but I believe that is due to our having misunderstood how the persistent volume reclaim policy that we wanted to mirror works. Based on my read of the above link, the behaviour I intend for this PR is:
- [Managed resources with the] Delete policy are deleted when their bound resource claim is deleted.
- [Managed resources with the] Delete policy delete their external resource when they are deleted.
- [Managed resources with the] Retain policy are not deleted when their bound resource claim is deleted.
- [Managed resources with the] Retain policy do not delete their external resource when they are deleted.
- […] Released (not Unbound) when their resource claim is deleted.
- [A] Released managed resource cannot be reused; it sticks around only so the infrastructure operator can perform any cleanup tasks they need to before manually deleting it.
- […] Retain, in that the absence of a reclaim policy is treated as Retain.
- […] Delete reclaim policy.
Testing Checklist
Before merging this PR I intend to:
- […] Retain by dynamically provisioning a managed resource without a reclaim policy and observing that the managed resource's reclaim policy is unset.
- […] Delete by dynamically provisioning a managed resource using a resource claim without a resource policy and observing that the managed resource's reclaim policy is Delete.
- […] Retain (or unset) are retained in binding phase Released when their claim is deleted.
- […] Retain (or unset) retain their external resources when deleted.
- […] Delete are deleted when their claim is deleted.
- […] Delete delete their external resources when deleted.
Checklist
I have:
- […] make reviewable to ensure this PR is ready for review.
- […] clusterrole.yaml to include any new types.
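For readers skimming the thread, the intended behaviour listed in the description can be summarised as a small decision function. This is only an illustrative sketch, not the code in this PR: the ReclaimPolicy type, the Outcome struct, and the OutcomeFor function are hypothetical names invented here, and the handling of unrecognised policy values (treated like Retain) is an assumption.

```go
package main

import "fmt"

// ReclaimPolicy is a hypothetical stand-in for the managed resource reclaim
// policy discussed in this PR. The names below are assumptions.
type ReclaimPolicy string

const (
	ReclaimDelete ReclaimPolicy = "Delete"
	ReclaimRetain ReclaimPolicy = "Retain"
)

// Outcome summarises what happens at the two deletion events the
// description talks about.
type Outcome struct {
	// ManagedDeletedWithClaim: deleting the bound resource claim also
	// deletes the managed resource.
	ManagedDeletedWithClaim bool
	// ManagedRetainedAsReleased: the managed resource survives claim
	// deletion and stays in binding phase "Released" (not "Unbound").
	ManagedRetainedAsReleased bool
	// ExternalDeletedWithManaged: deleting the managed resource also
	// deletes its external resource.
	ExternalDeletedWithManaged bool
}

// OutcomeFor maps a reclaim policy to its outcome. An unset policy is
// treated as Retain, as the description states. Any other value also falls
// through to Retain here, which is an assumption of this sketch.
func OutcomeFor(p ReclaimPolicy) Outcome {
	if p == ReclaimDelete {
		return Outcome{
			ManagedDeletedWithClaim:    true,
			ExternalDeletedWithManaged: true,
		}
	}
	return Outcome{ManagedRetainedAsReleased: true}
}

func main() {
	// Print the outcome for Delete, Retain, and an unset policy.
	for _, p := range []ReclaimPolicy{ReclaimDelete, ReclaimRetain, ""} {
		fmt.Printf("policy %q: %+v\n", p, OutcomeFor(p))
	}
}
```

A Released managed resource is deliberately not modelled as re-bindable here; whether that should ever be possible is the discussion tracked in #88.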