Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run a cleanup plan when deleting instances #1108

Merged
merged 9 commits into from
Dec 5, 2019
Merged

Conversation

nfnt
Copy link
Member

@nfnt nfnt commented Nov 27, 2019

What this PR does / why we need it:
If an operator provides a cleanup plan, this plan is run when the operator's instance is deleted from the cluster. This is achieved by detecting an instance deletion in the controller and adding a finalizer. If there is another plan currently in progress, this will be superseded by the cleanup plan. This allows the deletion of instances where a plan fails to complete.

Fixes #1046

Jan Schlicht added 2 commits November 27, 2019 10:00
If an operator provides a 'cleanup' plan, this plan is run when the operator's instance is deleted from the cluster.
This is achieved by detecting an instance deletion in the controller and adding a finalizer.
If there is another plan currently in progress, this will be superseded by the cleanup plan.
This allows the deletion of instances where a plan fails to complete.
Copy link
Contributor

@alenkacz alenkacz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to keep the controller code as compact as possible since it's the most important loop we have and isolate all the logic into the helper methods :) But other than that looking good!

Also there's one more case I need to think about to be sure we don't get stuck there :)

I am basically just trying to think if there's a way that we can remain stuck with the finalizer in, which is the most frustrating thing :) but I think we might be fine :)

pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
Copy link
Contributor

@alenkacz alenkacz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing :)

pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
@nfnt
Copy link
Member Author

nfnt commented Dec 2, 2019

Thanks for the detailed case that isn't covered yet by the code. I'll change the logic to look like this:

  • First, only add the finalizer if the cleanup plan isn't complete.
  • Later, if there is a finalizer, remove it if the cleanup plan is complete.
    This is close to the current approach that checks if the last completed plan was the cleanup plan. But by chosing a more general approach, we'll cover reconciliation edge cases, like the one you described.

@nfnt nfnt requested a review from alenkacz December 4, 2019 09:20
Copy link
Contributor

@alenkacz alenkacz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I am just at the edge of approving 😛 sorry. I would just like to minimize the update requests from 2 to 1 since I think that's possible. The rest is just nits :)

pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
Copy link
Contributor

@alenkacz alenkacz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, nice job! Don't forget about docs, please :)

pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
@ANeumann82
Copy link
Member

Looks good to me. One thing I was thinking about was changes in operators:

  • When a new operatorVersion contains a cleanup plan that was not existing in the previous version, the finalizer should be added automatically on the next reconcile, correct?
  • If a cleanup plan gets removed from an operatorVersion, the finalizer still needs to be removed at some point. I've added a comment at maybeRemoveFinalizer. It's and edge case though.

@nfnt
Copy link
Member Author

nfnt commented Dec 4, 2019

@ANeumann82 Yes, the finalizer will be added automatically. But with the current code it won't get removed automatically. That's a good catch and is easy to fix, I'll add a fix.

@kensipe
Copy link
Member

kensipe commented Dec 4, 2019

My greatest concern here isn't with this code (which lgtm!).
We are developing a list of special plan names which we need to make absolutely clear to operator developers. the name "deploy" and now "cleanup" are special... they are not arbitrary. We need to capture in our documentation these details and the semantic behaviors of these special named plans.

Copy link
Member

@kensipe kensipe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nfnt
Copy link
Member Author

nfnt commented Dec 5, 2019

@kensipe 💯, this needs to be clear from the documentation (that I'll work on today). Also, operator devs should expect that the cleanup plan may fail, e.g. when trying to remove resources that haven't been created yet because the operator is stuck in deploy.

@nfnt nfnt merged commit 2d4c9ac into master Dec 5, 2019
@nfnt nfnt deleted the nfnt/delete-workflow-trigger branch December 5, 2019 10:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Delete instance workflow/trigger
4 participants