Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not delete snap label #205

Merged

Conversation

tesshuflower
Copy link
Contributor

Describe what this PR does
Provides a way for a user to label a snapshot with a do-not-delete label.

If the label is present and equal to "true":

  • Do not cleanup the snapshot in the cleanup cycle at the end of all mover's reconcile. Remove the volsync cleanup label and also remove ownership so the snapshot will not be deleted when the ReplicationDestination is deleted.
  • At the beginning of replication destination Reconcile, also check and remove the cleanup label and ownership.

Is there anything that requires special attention?
Even though this is really intended for ReplicationDestination only, technically the cleanup cycle will happen in movers for ReplicationSource too - so that cleanup would do the same thing if someone happened to add the do-not-delete label to a snapshot used by a replicationsource.

Related issues:
#196

For: backube#196

Signed-off-by: Tesshu Flower <tflower@redhat.com>
Signed-off-by: Tesshu Flower <tflower@redhat.com>
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 20, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tesshuflower

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@codecov
Copy link

codecov bot commented Apr 20, 2022

Codecov Report

Merging #205 (1a113f1) into main (91ca8bb) will decrease coverage by 2.4%.
The diff coverage is 68.3%.

@@           Coverage Diff           @@
##            main    #205     +/-   ##
=======================================
- Coverage   62.8%   60.3%   -2.5%     
=======================================
  Files         27      31      +4     
  Lines       2661    2898    +237     
=======================================
+ Hits        1672    1749     +77     
- Misses       860    1007    +147     
- Partials     129     142     +13     
Impacted Files Coverage Δ
controllers/mover/rclone/mover.go 73.8% <ø> (ø)
controllers/mover/restic/mover.go 78.8% <ø> (-0.9%) ⬇️
controllers/replicationdestination_controller.go 85.1% <33.3%> (-0.9%) ⬇️
controllers/volumehandler/volumehandler.go 63.0% <50.0%> (+0.3%) ⬆️
controllers/utils/cleanup.go 58.2% <70.7%> (ø)
controllers/mover/rsync/mover.go 78.0% <0.0%> (-1.3%) ⬇️
controllers/utils/reconcile.go 0.0% <0.0%> (ø)
controllers/utils/utils.go 0.0% <0.0%> (ø)
controllers/utils/sahandler.go 0.0% <0.0%> (ø)
... and 1 more

@tesshuflower
Copy link
Contributor Author

/cc @JohnStrunk

@openshift-ci openshift-ci bot requested a review from JohnStrunk April 22, 2022 15:21
Copy link
Member

@JohnStrunk JohnStrunk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add a check in the Snapshot CreateOrUpdate call volumehandler.go ensureImageSnapshot so that we don't add back the owner reference that we (may have) just removed in rd_controller.go?

logger logr.Logger, owner client.Object) error {
// Find all snapshots in the namespace with the do not delete label
listOptions := []client.ListOption{
client.MatchingLabels{DoNotDeleteLabelKey: DoNotDeleteLabelValue},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we require a specific value for the label or just its presence?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JohnStrunk would you prefer just to look for the label presence? I think I could use a label selector for that - but might be strange if someone adds a "false" value or something like that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thus the question... It's probably best to just look for it's presence and when we do the docs say something like label: somevalue. Looking for a specific value brings the issue of truthiness and capitalization, plus what if someone later sets it to a falsey value? We're not taking it back. 😁

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks John, that actually makes a lot of sense (particularly the "not taking it back part" :) ), I'll make the change to look for label presence only!

ownershipRemoved, err := RemoveSnapshotOwnershipIfRequestedAndUpdate(ctx, c, logger,
owner, &snapshot)
if err != nil {
return remainingSnapshots, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to stop processing the list if we fail updating one snapshot? What if we encounter a situation where we're unable to delete a given snapshot for some reason (e.g., permission issues). It'll prevent us from deleting ones that we should be able to remove. This would be fine if we were sure any error would be transient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this makes sense, will go though the list and throw the error at the end if one occurs

// Remove ReplicationDestination owner reference if present
updatedOwnerRefs := []metav1.OwnerReference{}
for _, ownerRef := range obj.GetOwnerReferences() {
if ownerRef.Kind+"/"+ownerRef.Name == ownerKindAndName {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about checking for a UID match instead? The kind/name matching makes me worry about race conditions w/ recreating objects of the same name.

if err != nil {
logger.Error(err, "error removing cleanup label or ownerRef from snapshot",
"name", snapshot.GetName(), "namespace", snapshot.GetNamespace())
return ownershipRemoved, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this returns true, err if the Update fails. Is that the correct semantics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic got a little convoluted as true ended up being the way I detect (in outer calling function) whether the snapshot should be put in the list that I can then cleanup - think it does make sense to return false here.


// Remaining snapshots should be cleaned up
for i := range snapsForCleanup {
snapForCleanup := &snapList.Items[i]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? not &snapsForCleanup[i]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? not &snapsForCleanup[i]?

Yikes! This is a bad one - nice catch @JohnStrunk

Comment on lines 117 to 125
snapResourceVersion := snapForCleanup.ResourceVersion

deleteOptions := &client.DeleteOptions{
Preconditions: &metav1.Preconditions{
ResourceVersion: &snapResourceVersion,
},
}

err := c.Delete(ctx, snapForCleanup, deleteOptions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be simplified to:

Suggested change
snapResourceVersion := snapForCleanup.ResourceVersion
deleteOptions := &client.DeleteOptions{
Preconditions: &metav1.Preconditions{
ResourceVersion: &snapResourceVersion,
},
}
err := c.Delete(ctx, snapForCleanup, deleteOptions)
err := c.Delete(ctx, snapForCleanup, client.Preconditions{ResourceVersion: &snapForCleanup.ResourceVersion})

@tesshuflower
Copy link
Contributor Author

Do we need to add a check in the Snapshot CreateOrUpdate call volumehandler.go ensureImageSnapshot so that we don't add back the owner reference that we (may have) just removed in rd_controller.go?

I was undecided on this - I was hoping not to add the checks for the label all over the place so ended up not inserting another check there - my thinking is that even if we did add back the owner reference, once we get to the cleanup stage it'll get removed again.

I guess the reverse issue is if someone then deletes the replication destination before we cleanup - I think going over that scenario does uncover a case where we're better of adding the extra check.

- match by owner using UID instead of kind/name
- fix confusing true returned in error case
- fix in cleanup snap using wrong list
- simplify delelte with precondition call

Signed-off-by: Tesshu Flower <tflower@redhat.com>
- Match do-not-delete label by only the label name, ignore value
  (if the label exists, we will not cleanup and will relinquish
   ownership)
- In volume handler ensureImageFromSnapshot() - check for label
  here (and remove ownership etc) as well to prevent adding it back
- unit test updates

Signed-off-by: Tesshu Flower <tflower@redhat.com>
@tesshuflower
Copy link
Contributor Author

/retest

@JohnStrunk
Copy link
Member

/lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants