Semantics of destructive operations #82

mhausenblas · 2020-07-15T16:46:34Z

If an ACK user deletes a Kubernetes namespace that has, say, an S3 bucket custom resource in it, is the S3 bucket also deleted or not?

The answer to this question (and the respective UX) should follow the "no surprises" principle. IOW: opting into (forcing) destructive operations rather than silently cascading the delete from the Kubernetes cluster perimeter to AWS services.

donpinkster · 2020-08-20T06:33:00Z

In EC2 you can instances tag them as "unterminatable", so you have to remove a flag first before you can terminate EC2 instances. It would be great to have something like this for the ACK resources which have state in them.

For example RDS, if some rogue component, a wrong GitOps commit or whatever reason a delete event happens on the RDS object in the K8s API I would like to have an option to prevent the actual deletion of the associated resource in AWS.

mhausenblas · 2020-08-20T07:21:08Z

Thanks @donpinkster and indeed I was thinking along that line, potentially on the namespace level. There are a couple of questions like is it opt in or force deletion, but keep it coming, all of this should give us enough input for a design. Any volunteers? ;)

justinfiore · 2020-08-20T11:15:55Z

I agree. Deletion should either be a 2 step process or something that must explicitly be configured on the custom resource.
A 2-step process could be implemented in the following way:
Have a property on the resource like cascadeDelete which defaults to false.
You could then edit the resource, set that to true, then delete the resource.

Or if you didn't want a two-step process, just set that to true from the start.

I think the behavior should be the same across all resource types. Not just ones that are stateful, as that would be surprising behavior.

joberdick · 2020-08-20T11:35:02Z

I would make this an option on the controller side. It being per object would be good. Like maybe we want ECR Repos to be removed with the namespace, but not an s3 bucket.

jaypipes · 2020-08-20T12:19:51Z

@joberdick Yes, that's what I'm leaning towards as a long-term solution.

amitkarpe · 2020-08-21T05:33:19Z

We can use annotations or labels, which will declare whether this object is expected to delete or not.

For example, helm uses "annotations" like:
helm.sh/resource-policy: keep

For example, skaffold uses "labels" like:
skaffold.dev/cleanup: "false"

szuecs · 2020-08-22T11:43:49Z

If you have an annotation, how do you solve if you have 2 references with different annotation value?

In general I agree don’t delete data just like that.

ravishtiwari · 2020-08-24T17:46:10Z

If an ACK user deletes a Kubernetes namespace that has, say, an S3 bucket custom resource in it, is the S3 bucket also deleted or not?

The answer to this question (and the respective UX) should follow the "no surprises" principle. IOW: opting into (forcing) destructive operations rather than silently cascading the delete from the Kubernetes cluster perimeter to AWS services.

I agree to the sentiments of No surprises.
Storage or any other location where we store:

Data
State

shouldn't removed silently, instead, users should be informed about residual and what they should do in order to may be get a clean slate.

we can either have a config like:

auto_clean_on_delete: ["s3", "ecr", ...]

or force to add a similar tag for each of the resource, like: auto_remove_enabled:false some similar to what @donpinkster said, so that we know what are going to remove and what will automatically be cleaned.

Cheers,

dcherman · 2020-08-24T18:42:19Z

This sounds very similar to the reclaim policy feature of PVs:

https://kubernetes.io/docs/concepts/storage/persistent-volumes/#reclaiming

Unless there's a reason not to do so, it seems like it'd help reduce confusion if we re-used the same/similar field naming and terminology as the built-in PV object

reclaimPolicy: Retain | Delete

Quentin-M · 2020-08-25T19:10:14Z

The existing reclaim policy, which defaults to Retain and must be overridden to Delete / Recycle is nice indeed, but it may not work as well as it does for PV/PVCs because a PV that was formerly attached to a deleted PVC in Retain mode will still show in the PV list (as Released ) and therefore be cleaned up easily as necessary by an administrator a posteriori, whereas the S3 CR will be gone for good, making tracking/cleaning up harder (would have to go into AWS and rely on some heuristics or AWS Tags marking the staleness).

Rather, I'd suggest using the one Kubernetes feature designed specifically for those use-cases, the ValidatingAdmissionWebhook. With just about 15 lines of YAML, you can instruct Kubernetes to ask the question to your operator via its service endpoint whether the command should be honored or not. In your operator's code, you can then make an informed decision (command accepted/honored vs. command denied) based on the policy specified within the custom resource. By default, the CRD should set the policy of any new stateful CR to something safe. For instance, in the case of an S3 bucket, similarly to what the S3 UI has, the default policy could be "can't delete if the bucket is not empty", the operator can easily check whether the bucket is empty or not, and accept/deny the request. If the tenant wants to bypass that and make the operator delete the bucket & its content entirely, the tenant can issue a quick CR modification (e.g. kubectl edit/patch) to change to policy & issue the command again.

That'd also reduce the likelihood of a tenant having to re-import a CR after mistakenly and actually removing it from Kubernetes (but retaining it in AWS thankfully).. which is "painful" enough to do with a PV/PVC as they have to bind together. That does not imply that an "import/reconciliation" feature should not be written, which could also be very useful to migrate existing AWS resources (e.g. Terraform managed) into AWS Controller's management on Kubernetes, without downtime.

RedbackThomson · 2021-11-05T17:24:21Z

/lifecycle frozen

Issue #, if available: #82 Description of changes: Draft proposal for mechanisms to protect users against accidental deletion of ACK and/or AWS resources By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Issue #, if available: aws-controllers-k8s/community#82 Description of changes: As per the proposal: aws-controllers-k8s/community#1148 Adds support for a new optional CLI argument for the controller binary `--deletion-policy`, supporting `delete` and `retain` as available options. Setting this argument to `delete` leaves the controller as the current default behaviour, whereby it deletes the AWS resource under management before it is removed from the K8s API server. Setting this argument to `retain` modifies the default behaviour, leaving the AWS resources intact (taking no action on them) before removing it from the K8s API server. Adds support for a new annotation on `Namespace` objects: `{service}.services.k8s.aws/deletion-policy: delete/retain` (where `{service}` is the service alias of the controller eg. `s3`). This annotation overrides the deletion policy behaviour configured for the controller for all resources deployed under the namespace. Adds support for a new annotation on any ACK resource object: `services.k8s.aws/deletion-policy: delete/retain`. This annotation overrides only the deletion policy behaviour configured for the resource. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

RedbackThomson · 2023-01-19T00:10:29Z

Merged the deletion policy proposal and published the feature
#1148
aws-controllers-k8s/runtime#107

mhausenblas added kind/enhancement Categorizes issue or PR as related to existing feature enhancements. design labels Jul 15, 2020

mhausenblas pinned this issue Aug 6, 2020

mhausenblas self-assigned this Aug 16, 2020

mhausenblas added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Aug 24, 2020

Vlaaaaaaad mentioned this issue Aug 27, 2020

Possibility of "adopting" existing AWS resources #41

Closed

ack-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 26, 2021

aws-controllers-k8s deleted a comment from ack-bot Aug 27, 2021

jaypipes removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 27, 2021

jaypipes unassigned mhausenblas Aug 27, 2021

RedbackThomson mentioned this issue Oct 5, 2021

Resource deleted before finalizer is removed #1004

Closed

ack-bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Nov 5, 2021

a-hilaly mentioned this issue Nov 5, 2021

Deletion Protection For ACK Provisioned Resources #1017

Closed

RedbackThomson mentioned this issue Jan 28, 2022

Semantics for destructive operations design proposal #1148

Merged

vijtrip2 unpinned this issue Jan 28, 2022

RedbackThomson pinned this issue Feb 25, 2022

RedbackThomson self-assigned this Feb 25, 2022

RedbackThomson added this to the ACK Core (2021-02-28 thru 2022-03-11) milestone Feb 25, 2022

a-hilaly removed this from the ACK Core (2021-02-28 thru 2022-03-11) milestone Mar 11, 2022

a-hilaly added this to the ACK Core (2021-03-14 thru 2022-03-25) milestone Mar 11, 2022

a-hilaly added kind/design Categorizes issue or PR as related to a technical design. and removed design labels Dec 13, 2022

RedbackThomson mentioned this issue Jan 6, 2023

Add deletion policy CLI arguments and annotations aws-controllers-k8s/runtime#107

Merged

RedbackThomson closed this as completed Jan 19, 2023

RedbackThomson unpinned this issue Jan 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Semantics of destructive operations #82

Semantics of destructive operations #82

mhausenblas commented Jul 15, 2020

donpinkster commented Aug 20, 2020

mhausenblas commented Aug 20, 2020

justinfiore commented Aug 20, 2020

joberdick commented Aug 20, 2020

jaypipes commented Aug 20, 2020

amitkarpe commented Aug 21, 2020 •

edited

szuecs commented Aug 22, 2020 •

edited

ravishtiwari commented Aug 24, 2020 •

edited

dcherman commented Aug 24, 2020

Quentin-M commented Aug 25, 2020

RedbackThomson commented Nov 5, 2021

RedbackThomson commented Jan 19, 2023

Semantics of destructive operations #82

Semantics of destructive operations #82

Comments

mhausenblas commented Jul 15, 2020

donpinkster commented Aug 20, 2020

mhausenblas commented Aug 20, 2020

justinfiore commented Aug 20, 2020

joberdick commented Aug 20, 2020

jaypipes commented Aug 20, 2020

amitkarpe commented Aug 21, 2020 • edited

szuecs commented Aug 22, 2020 • edited

ravishtiwari commented Aug 24, 2020 • edited

dcherman commented Aug 24, 2020

Quentin-M commented Aug 25, 2020

RedbackThomson commented Nov 5, 2021

RedbackThomson commented Jan 19, 2023

amitkarpe commented Aug 21, 2020 •

edited

szuecs commented Aug 22, 2020 •

edited

ravishtiwari commented Aug 24, 2020 •

edited