Skip to content
This repository has been archived by the owner on May 6, 2022. It is now read-only.

Determine best way to offer a mitigation for 'stuck' instances and bindings #1723

Closed
pmorie opened this issue Feb 8, 2018 · 12 comments
Closed
Assignees
Labels
Milestone

Comments

@pmorie
Copy link
Contributor

pmorie commented Feb 8, 2018

We have a few different issues where instances or bindings can become 'stuck' after deletion due to a few different reasons:

  • Broker went down
  • Bug in a broker that prevents broker from correctly being able to deprovision/unbind
  • etc etc

See: #1551 #908

A workaround for these cases is to edit the resource and remove the finalizer token for the catalog. That's a workable stopgap for now; it's arguable that there should be a better way to remove instances/bindings in a bad state.

Some information that I can add to this:

  • The --force option for kubectl is 100% client-side and does not differentiate the DeleteOptions sent to the API server in anyway for our resources
  • Same thing as --force for --now
  • It is likely that at some point in the future a special permission will be required to let a user delete a finalizer from a resource (Allow limiting adding/removing finalizers kubernetes/kubernetes#59591)

Another factor we must consider is that users, especially those new to Kubernetes, frequently believe that the catalog is somehow broken when an instance doesn't immediately disappear after it has been deleted. We don't want to give users a tool that they will use in an overly eager way and then run into problems because they used the method of last resort incorrectly as a convenience.

One possibility is that we could have a svcat command that:

  • Was clearly meant for administrators to use only
  • Clearly explained in red, capital letters that force-deleting a binding or instances has consequences, and the consequences potentially involve you paying money for resources that you are not using
  • Required you to interactively type Y or pass -y as an option
  • Would remove the service-catalog finalizer only from the list of finalizers
@pmorie pmorie added this to the 0.2.0 milestone Feb 8, 2018
@carolynvs
Copy link
Contributor

I am pro the "svcat nuclear option". 💣

@carolynvs
Copy link
Contributor

carolynvs commented Feb 8, 2018

Another factor we must consider is that users, especially those new to Kubernetes, frequently believe that the catalog is somehow broken when an instance doesn't immediately disappear after it has been deleted.

svcat should make it more promenent that a instance is scheduled to be deleted. Here's the issue that was on our old backlog for that: Azure/service-catalog-cli#70

UPDATE: I've copied that issue over #1728

@rthallisey
Copy link

The ansible-service-broker community was looking at tackling this from a broker perspective via openshift/ansible-service-broker#666.

To echo the concern here, we want to make sure the user is fully aware that if they want the service-instance always deleted then they break the guarantee that all resources are cleaned up. So the solution the community is thinking of merging is a broker config option called force_delete: true that is well documented with bold red letters. If enabled, it will attempt a deprovision, but always return a 200 to the serivce-catalog. I'm wondering what folks think of that as an alternative?

@rthallisey
Copy link

tagging folks @pmorie @carolynvs ^

@carolynvs
Copy link
Contributor

Even if the broker had a force delete, we may still want this cleanup functionality in service catalog for cases where the broker is not acting as expected, and the user really wants to remove the k8s resources.

@MHBauer
Copy link
Contributor

MHBauer commented Mar 12, 2018

I didn't know about

A workaround for these cases is to edit the resource and remove the finalizer token for the catalog.

To me that sounds less like a workaround and more like the exact appropriate thing to do given how the system works. Sure it would be nice to have a button to hit, but in the lack of any clear functionality provided to force-delete, is it really appropriate to go write server side code to solve this?

@carolynvs
Copy link
Contributor

To me that sounds less like a workaround and more like the exact appropriate thing to do given how the system works

I see this as a workaround in that our API doesn't have a way to deal with certain conditions. It's not unexpected to get into this state (or at least I seem to manage to do it often enough). If Service Catalog had an endpoint or flag to say "Hey this is totally borked, I give up and I just want to make it go away", service catalog could handle logging that it was asked to give up, remove the finalizer, and allow the resource to be deleted. It gives us more control over audit and mitigation, without telling people that it's expected that at times they will have to manually circumvent Service Catalog, which is an uncomfortable precedent to set.

@carolynvs
Copy link
Contributor

carolynvs commented May 18, 2018

In conjunction with #1728, here's what I think may be a workable UX:

$ svcat unbind --abandon
This action is not reversible and may cause you to be charged for the broker resources that are abandoned.
Are you sure? [y|N]

deleted RESOURCENAME

$ svcat deprovision --abandon
This action is not reversible and may cause you to be charged for the broker resources that are abandoned.
Are you sure? [y|N]

deleted RESOURCENAME

$ svcat help deprovision
Deletes an instance of a service

Usage:
  svcat deprovision NAME [flags]

Examples:

  svcat deprovision wordpress-mysql-instance


Flags:
      --abandon bool    Forcefully delete the resource from Service Catalog ONLY, potentially abandoning any broker resources that you may continue to be charged for.
``

@carolynvs carolynvs added the svcat label Jun 5, 2018
@carolynvs
Copy link
Contributor

I wrote an example plugin for kubecon that implements deleting finalizers off of a pod, which may help explain what this new svcat would need to do:

https://github.com/carolynvs/kubectl-kill-plugin

@MHBauer
Copy link
Contributor

MHBauer commented Jun 5, 2018

#1723 (comment)
@carolynvs I think that ux looks acceptable. I like the warning. It may need a -y flag or otherwise a way to operate it non-interactively.
@jberkhahn WDYT?

@carolynvs
Copy link
Contributor

On one hand, I think scripting this is going to end badly for someone. On the other, I ain't their momma, so why not? +1 for -y

@MHBauer
Copy link
Contributor

MHBauer commented Aug 6, 2018

/assign @carolynvs
/close

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

5 participants