
Destroy 'provisioner' for instance resources #386

Closed
kforsthoevel opened this issue Oct 10, 2014 · 86 comments · Fixed by #11329

Comments

@kforsthoevel

It would be great to have some sort of 'provisioner' for destroying an instance resource.

Example:
When creating an instance, I bootstrap it with Chef and the node is registered with the Chef server. Now I need a way to automatically delete the node from the Chef server after Terraform destroys the instance.
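
For illustration, here is a minimal sketch of this use case written against the destroy-time provisioner syntax Terraform eventually shipped (a local-exec provisioner with when = "destroy"); the AMI, node name, and knife commands are illustrative assumptions, not anything specified in this issue:

```hcl
resource "aws_instance" "web" {
  ami           = "ami-12345678"   # illustrative AMI and instance type
  instance_type = "t2.micro"

  # Chef bootstrapping happens at create time (omitted here). The block below
  # runs on the machine executing Terraform just before the instance is
  # destroyed, removing the node and client from the Chef server.
  provisioner "local-exec" {
    when    = "destroy"
    command = "knife node delete web-01 -y && knife client delete web-01 -y"
  }
}
```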

@pmoust
Contributor

pmoust commented Oct 10, 2014

Yes, a provisioner that would be called on_destroy.
Would be useful for Puppet as well, to clean up certificates by doing the same sort of thing (getting onto the puppetmaster and issuing puppet cert clean ${aws_route53_record.foobar.*.cname}).
But more importantly, such an on_destroy event has some critical use cases:

  • doing general cleanup before destroying an instance (or pushing logs to storage)
  • gracefully deregistering it from the continuous delivery stack
  • shutting off services gracefully
  • notifying monitoring tools that it is OK that the instance is dying, etc.

@mitchellh
Contributor

What is the behavior if the provisioner fails? Does the destroy fail, and is it run again until it succeeds?

I'm open to the idea, but I'm not sure I see how Terraform could ever safely do this.

With provisioning on create, it is much simpler (for reference): The resource is created, the provisioners are run. If any provisioner fails, the resource is "tainted", and the entire thing is destroyed/created and tried again on a subsequent run.
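
For reference, a minimal sketch of that create-time flow in configuration; the bootstrap command, node name, and SSH user are illustrative assumptions:

```hcl
resource "aws_instance" "web" {
  ami           = "ami-12345678"   # illustrative values
  instance_type = "t2.micro"

  # Runs once the instance has been created. If this (or any) provisioner
  # fails, the resource is marked "tainted" and is destroyed and recreated
  # on a subsequent run, exactly as described above.
  provisioner "local-exec" {
    command = "knife bootstrap ${self.public_ip} -N web-01 -x ubuntu --sudo"
  }
}
```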

Destroy provisioners don't have the same "taint" concept, which results in them going into an uncertain state that Terraform can't reason about. If Terraform can't reason about it, we can't safely change infrastructure (one of the core tenets of the project). This might imply that this feature is not fit for this project.

@mitchellh mitchellh added the waiting-response An issue/pull request is waiting for a response from the community label Oct 10, 2014
@kforsthoevel
Author

I totally understand your point and honestly I don't have a smart answer. But I can offer two options which would work for me and hopefully for others as well.

I see the destroy provisioner as a post-destroy hook. So the instance should be destroyed by Terraform no matter what. After this, Terraform should try to execute the destroy provisioner, and in case it fails:

  1. just print out a warning and let me handle the fallout manually, or

  2. remember the state and re-execute only the destroy provisioner on the next run, maybe until it succeeds.

Does this make sense? I really think Terraform is awesome and I hope to use it together w/ Chef.

@mitchellh mitchellh added thinking and removed waiting-response An issue/pull request is waiting for a response from the community labels Oct 11, 2014
@armon
Member

armon commented Oct 12, 2014

I think we can run the destroy provisioner before the destroy step itself. This way if the provisioner fails, we abort the destroy, and the user can re-attempt on another Terraform run.

@mitchellh
Contributor

@armon What if there are multiple provisioners though? It would force us to keep track of provisioner state. Maybe that is part of the feature, but just worth pointing out.

@armon
Member

armon commented Oct 13, 2014

I think they have to be idempotent, or at least expected to run multiple times in some cases. We don't have to track their state; we just abort the Apply() on that node and retry it later. I think for most cases (de-registering servers) this should be fine.
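
To make the idempotency point concrete, a destroy-time command can be written so that re-running it after an aborted destroy is safe. This is only a sketch under the assumptions of the earlier example (a local-exec destroy provisioner placed inside the instance resource, with an illustrative node name):

```hcl
# Idempotent cleanup: if the node is already gone, the command succeeds and
# does nothing; if the node exists but deletion fails, the command fails, the
# destroy is aborted, and it is retried on a later run.
provisioner "local-exec" {
  when    = "destroy"
  command = "! knife node show web-01 >/dev/null 2>&1 || knife node delete web-01 -y"
}
```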

@pmoust
Contributor

pmoust commented Oct 14, 2014

I am with @armon on this one.

Three rules:

  • provisioners' actions must be idempotent
  • if a provisioner fails, the destroy of the resource is aborted
  • a force option should taint/destroy the resource even if the on_destroy provisioner fails

@sethvargo sethvargo changed the title Feature request: Destroy 'provisioner' for instance resources Destroy 'provisioner' for instance resources Nov 19, 2014
@woodhull

Agree that this would be a useful feature, and that idempotency should keep it safe. The options from @pmoust seem reasonable.

@tehnorm

tehnorm commented Apr 23, 2015

This would be an excellent feature - we've found a need to be able to clean up before a resource is destroyed.

@mlrobinson

+1

1 similar comment
@menicosia

+1

@dalehamel

With the addition of the 'chef' provisioner, IMHO this is a must-have.

Right now Terraform can provision instances with Chef, but not actually clean them up on destroy. This leaves stale clients and nodes on the Chef server.

@mitchellh I see this being implemented as a 'destroy' block within the provisioner block itself. If it's there, then on_destroy is called before destroy is called.

Without this feature, Terraform is going to keep causing garbage to pile up on our Chef server. We'd be happy to spend some time on a POC PR.

@dalehamel

cc @thegedge

@dhoer

dhoer commented Jul 24, 2015

👍 +1

I'm surprised packer does this, but terraform doesn't: https://www.packer.io/docs/provisioners/chef-client.html#skip_clean_client

@vjanelle

@mitchellh do what puppet does? have a test to see if it needs to execute the on_destroy?

@queeno

queeno commented Oct 1, 2015

👍 +1

1 similar comment
@grosendorf

👍 +1

@josephholsten
Contributor

So what's the next step here? Create a proof-of-concept no-op destroy provisioner and add the hooks? I like @pmoust's three rules, which can become these test cases:

  • provisioner succeeds, destroy the resource
  • provisioner fails, destroy of resource is aborted
  • force option provided, provisioner fails, still destroy the resource

Should there be an option to skip the on_destroy provisioner altogether?

@apparentlymart
Contributor

The "junk in Chef Server" problem (which I also have!) makes me think that we should consider letting a single provisioner hook in to both the create and destroy parts of the lifecycle. Possibly to update, too.

It'd be nice if you could just add provisioner "chef" and it would integrate with both the create and destroy actions and tidy up during destroy.

Of course at that point the provisioner starts to look quite a lot like a resource. In #3084 I included a chef_node resource that creates the chef server object but isn't able to actually get the server set up. I spent some time trying to figure out a suitable workflow there and didn't hit one; a provisioner being able to hook into destroying could be the missing link to make that work well.

@FergusNelson

Another use case for this feature is to allow an unmount command to run before an aws_volume_attachment is destroyed (i.e. before the volume is detached from a running instance).
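
A hedged sketch of how this could be expressed with the destroy-time provisioner that later shipped; the resource names, device path, SSH user, and the reference to the instance's public IP are all assumptions for illustration:

```hcl
resource "aws_volume_attachment" "data" {
  device_name = "/dev/xvdf"
  volume_id   = "${aws_ebs_volume.data.id}"
  instance_id = "${aws_instance.web.id}"

  # Unmount the filesystem on the running instance before Terraform detaches
  # the volume; if the unmount fails, the detach is aborted.
  provisioner "remote-exec" {
    when   = "destroy"
    inline = ["sudo umount /dev/xvdf"]

    connection {
      type = "ssh"
      user = "ubuntu"
      host = "${aws_instance.web.public_ip}"
    }
  }
}
```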

@w1lnx

w1lnx commented Jan 6, 2016

+1

1 similar comment
@trawler

trawler commented Jan 6, 2016

+1

@mitchellh
Contributor

@kris-nova Looks good! My thoughts (that may overlap with yours, just trying to be as detailed as possible):

  • Config changes, looks like you have those on lockdown.
  • Change EvalTree for apply operation: ignore destroy provisioners
  • Change EvalTree for destroy operation: run destroy provisioners prior to resource apply
  • Probably need to modify the EvalApplyProvisioner EvalNode to understand the on_destroy semantics and decide whether or not to return an error (hence halting the evaluation process, and with it the destruction of the resource)
  • Copious tests in context_destroy_test.go and context_apply_test.go to be able to test this stuff in-memory.

Those together would probably do it!

Let me know @kris-nova where your comfort level is and when you do or don't want me to jump in, and I'm happy to provide input. I think your RFC is solid (I actually missed it before you mentioned it!). The only reaction I'd have is maybe splitting up on_destroy into two fields later: a when field (when to run) and a field to specify behavior (attempt/require). This would allow us to build future provisioner lifecycle features (only on update, for example). HOWEVER, that is a nitpick and bikeshedding in the sense that those changes are easy relative to the core feature itself, so I'll ignore them. :)

@dmourati

Woohoo. Reviewing this thread after a similar discussion in #649.

@Techcadia
Contributor

Is this still targeted for v0.8?

@mitchellh
Contributor

It didn't make it, sorry! This was my fault for not starting this up earlier. But that's okay, we'll get it into 0.9 for sure. It may even sneak into a 0.8.x release depending on the type of changes necessary.

@Techcadia
Contributor

Thanks for the update @mitchellh. We are starting to work through the issues with Chef AD and Consul and how we remove old stale servers.

@imduffy15
Contributor

Would love to hear what you end up with @Techcadia

We ended up just having a job that runs every X minutes to detect stale objects.

@stack72
Contributor

stack72 commented Nov 25, 2016

@Techcadia / @imduffy15 I usually baked a script into the instance that ran a removal from our chef server on box termination.

@imduffy15
Contributor

@stack72 How are you telling the difference between shutdown, reboot and terminate?

@ashald
Contributor

ashald commented Dec 2, 2016

@mitchellh we're missing this sooo much for managing our Consul instances in AWS. Would be very happy to see the feature! Hope it will make its way into the next release.

@eedwardsdisco

@ashald have you tried using the consul leave_on_terminate feature?

@mitchellh mitchellh removed the thinking label Dec 4, 2016
@ashald
Contributor

ashald commented Dec 21, 2016

@eedwardsdisco that's a dangerous option for server nodes.

It's Christmas time, so let's hope a miracle happens and this sneaks into one of the 0.8.x releases! :)

@akaspin

akaspin commented Dec 21, 2016

Any chance of getting provisioning on destroy?

@nbering

nbering commented Jan 18, 2017

Aside from the branch on @kris-nova's fork - which is currently over two thousand commits behind - is work on this being done anywhere? Is there an open PR?

@asciifaceman

I've long since given up hope on this ever being worked on.

@mitchellh
Contributor

No open PR at the moment, but it is on the roadmap for 0.9. :)

@nbering

nbering commented Jan 18, 2017

@mitchellh Is there an official roadmap published somewhere else?

https://github.com/hashicorp/terraform/milestone/2

@mitchellh
Contributor

There is not. Years and years ago (pre-Terraform) I used to publish one, but I've been burned too many times by people being very upset when something doesn't make it in. I've learned since not to make such commitments.

That being said, it's on the roadmap. We hope to ship it.

@mitchellh
Contributor

mitchellh commented Jan 21, 2017

Howdy folks, I've continued the fantastic RFC from this issue and this is the final RFC that I'm going to run with to build this: https://docs.google.com/document/d/15nEcV7fxskDgYrXoNMl6RYIo10PCiZGle7TP8xitrFE/edit#

The work is in f-destroy-prov and is just about complete now. It is just missing some polish work around validation and docs. So I can confirm this will 100% make it into TF 0.9. Thanks so much to @kris-nova for the original RFC work, that helped save copious time during the design phase so I could just start going with implementation!

@krisnova

Wow! This is fantastic work @mitchellh 🥇

Your RFC 2.0 looks like exactly what the community needs, and I am so glad to see this coming to life. Again, can't thank you enough for all that you do. Looking forward to the PR - and to one day having a long awaited closure on the issue.

Go Terraform!

@mitchellh
Contributor

If on_failure = "continue" is set (not the default) then we'll just continue and the failure will only show up in the output. The output may be noisy, but we don't have a better way to surface that information at the moment (bonus: it'll be in red, so that helps if you have colors on). If this becomes a real issue then we can consider some other options, but I'd rather get the working feature out into the hands of the community and see where the pain points end up being. 🤢
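
A short sketch of how when and on_failure combine in the shipped syntax (placed inside a resource block as in the earlier sketches; the command and node name are illustrative assumptions):

```hcl
provisioner "local-exec" {
  when       = "destroy"
  on_failure = "continue"   # default is "fail", which aborts the destroy
  # A failure of this command is reported in the output but does not block
  # the resource from being destroyed.
  command    = "knife node delete web-01 -y"
}
```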

@krisnova

Ah, thanks - I nuked my original question (as I read on, I found more information in the implementation section), which was:

How will Terraform handle a failed destroy provisioner? Will this be tracked anywhere?

+1 to iterating on this.. I think you are nailing the feature exactly as we want, and I am pumped to see where this evolves to!

I am curious though.. what if we have an on_failure = "continue" and the provisioner itself breaks the program? (Might be a question for another forum, so feel free to defer me) Will terraform recover from the potential fatal, and continue destroying?

@mitchellh
Contributor

I think the discussion here is fine. 👍

If a provisioner returns any error (connection error, execution error, etc.) it will continue destruction. If the provisioner causes Terraform to crash we won't, but I think that's a reasonable tradeoff (provisioners should not crash Terraform 😛).

The broadness of the "failure" type is necessary due to the current core <=> provisioner API interface. Changing that would be pretty messy/annoying and a deeper change. Again, it's possible if it ends up adding high value, but not something I'd casually do unless there was enough of a proven use case.

Hope I answered your Q right... I didn't fully understand it (late Friday night reading is perhaps an excuse; I dunno, but probably my own fault).

@krisnova

Great explanation! Thanks! You're right.. maybe I should get back to Friday night. Cheers

@ghost

ghost commented Apr 17, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 17, 2020