feature request: inverse targeting / exclude #2253

Open
shubhambhartiya opened this issue Jun 6, 2015 · 162 comments

Comments

@shubhambhartiya

Is there a way to preserve a db_instance (an RDS instance created by the Terraform files) when we destroy the whole state?

@phinze phinze added the question label Jun 7, 2015
@phinze
Contributor

phinze commented Jun 7, 2015

Hi @shubhambhartiya - we have prevent_destroy, which provides protection against accidental destruction, but it sounds like perhaps you're asking about a "destroy everything but this" feature.

Can you elaborate on the behavior you're looking for?

@phinze phinze added the waiting-response An issue/pull request is waiting for a response from the community label Jun 7, 2015
@shubhambhartiya
Author

Consider an example.
I have a set of tf files that create a VPC, subnets, an ASG, security groups, instances in various subnets, NAT instances, and databases (RDS). I want to plan so that when I destroy, the RDS instance stays in place (the VPC and subnets would still be needed) and everything else gets destroyed.

@phinze phinze removed the waiting-response An issue/pull request is waiting for a response from the community label Jun 8, 2015
@phinze
Contributor

phinze commented Jun 8, 2015

Ah okay I get it now. I think I'd call what you're looking for "inverse targeting".

# Destroy everything except aws_db_instance.foo and its dependencies
terraform plan -destroy -exclude=aws_db_instance.foo

^^ If that looks like what you're asking for I'll edit the title and we can track that feature request with this thread.

@shubhambhartiya
Author

Yes, that would be a great thing.

@phinze phinze changed the title RDS destroy - functionality feature request: inverse targeting / exclude Jun 8, 2015
@glerchundi
Contributor

yes please! 👍

@josephholsten
Contributor

this is related to #4515

@anosulchik

Just to confirm that it would be nice to have an inverted targeting feature, as follows:

terraform apply -target-exclude aws_ecs_service.ecs_service

Thanks.

@anagrius

+1

@analytically

+1

@beanaroo

Another use case:

We're importing existing AWS environments. Migrating the DB to the new subnet group is a manual step. It would be nice to provision all the subnet/security/parameter groups before updating the instance (all part of the same module)

@cmacrae

cmacrae commented Jul 28, 2017

I'd also love to see this. In the meantime, I'm using a combination of lifecycle to "protect" certain resources, and targeting like so:

terraform plan -destroy $(for r in `terraform state list | fgrep -v resource.address.to.exclude` ; do printf "-target ${r} "; done) -out destroy.plan

Not pretty, but it does the job 🙂
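
The same workaround can be written a little more defensively. The following is a minimal sketch, not a Terraform feature: `exclude_to_targets` is a hypothetical helper name, and it assumes `terraform state list` prints one resource address per line. It drops addresses that exactly match an exclusion and prints one `-target=ADDR` argument for each remaining address.

```shell
#!/bin/sh
# Hypothetical helper (not part of Terraform): read resource addresses on
# stdin, one per line, drop those that exactly match any argument, and print
# a -target=ADDR line for each remaining address.
exclude_to_targets() {
  # Write the exclusions to a temp file so grep can read them with -f.
  excl=$(mktemp) || return 1
  printf '%s\n' "$@" > "$excl"
  # -F: fixed strings, -x: whole-line match, -v: invert the match.
  grep -Fxv -f "$excl" | sed 's/^/-target=/'
  rm -f "$excl"
}
```

One possible invocation, assuming GNU xargs (`-r` skips the run when nothing is left to target, `-d '\n'` keeps addresses that contain quotes intact): `terraform state list | exclude_to_targets aws_db_instance.foo | xargs -r -d '\n' terraform plan -destroy -out=destroy.plan`. Because `-target` also pulls in related resources, the excluded resource may still show up in the plan if it depends on something being destroyed, so the plan output is worth reviewing before applying.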

@olenm

olenm commented Aug 16, 2017

Following what @anosulchik posted:

Just to confirm that it would be nice to have an inverted targeting feature, as follows: terraform apply -target-exclude aws_ecs_service.ecs_service

It would be great if -target and something like -target-exclude supported regexp or name-prefix matching similar to Consul, such as:
terraform apply -target-exclude aws_ecs_service. which would match all addresses that start with aws_ecs_service.
Or, if it's a regexp, it can be more explicit, which would be ideal.

thanks

@ffoysal

ffoysal commented Sep 16, 2017

terraform destroy -target-exclude aws_db_instance.my_rds
It would be great to have this feature so that we can destroy everything except the RDS instance. It would save us a significant amount of time, as RDS takes around 30 minutes to create and times out during destroy.

@ColOfAbRiX

ColOfAbRiX commented Oct 9, 2017

This would be really useful, so I can destroy everything except the resources marked with prevent_destroy. At the moment, because of prevent_destroy, I comment out everything except that code and run apply instead of destroy. Very unintuitive.

@henning
Contributor

henning commented Oct 11, 2017

+1 :)

My current workaround is to take the output of "terraform plan list", grep out all the resources I want to keep, and then create a list of -target parameters from the rest with a shell script.

Another thing that would make it super easy to destroy everything except the things you want to keep would be to destroy all resources except those protected by the "prevent_destroy" flag.
Actually, in my opinion the behaviour of that flag is not ideal: if I call destroy, I want to destroy the configured resources, and to me it's logical that the prevent_destroy flag applies only to the one resource, not to the whole setup. The way it works now, it protects the whole configuration from destruction, not the single resource. And this is the most frequently mentioned reason why people need this here...

@laura-herrera

Still, it would be very useful to have terraform apply -exclude, as sometimes your ECS cluster has changed due to autoscaling rules and you don't want to change that, but you might want to add more resources, etc.

@tigpt

tigpt commented Dec 31, 2023

I agree. I felt the need for this because someone changed production in clickops creating a drift. I wanted to apply other changes but exclude the resource with the drift to fix it later.

I think we should not limit the tools to try to protect use cases, we should enable people to use them to help solve their pain points.

@MohnJadden

Speaking as that guy who changed prod in clickops and caused a drift, the use case of targeting resources/scopes/etc. is an exceptionally helpful one. I understand that a product can't be coded to accommodate every single ask from the community, but given that the issues I linked a few comments up were caused by Terraform having issues executing its own functionality, it stands to reason that we should be able to say "Terraform, only do this much" if we want to.

Most of TF seems to be built with the idea that it's only meant for developers to build cloud environments, when in reality, cloud admins/engineers would like to make use of it when dealing with core infrastructure. Not every part of core infrastructure is updated at once, and in a larger org, it may not be feasible to have a requirement to apply against entire environments. Programs and products can and do grow, add features, etc. This seems like it's not too difficult to add filter logic, but the lack of any materially significant details as to why it isn't being done is a deafening silence. Not to breach code of conduct but if people are outwardly saying that a hard fork into a third party tool can fix the issue, shouldn't that be a reason to improve your own product?

I don't think it's tangential to ask for better details on how this issue is triaged and/or planned, assuming it is still high up the list in upvotes. I am not exactly king of the FOSS community - I'm just a humble Azure engineer - but asking this to go to the forums does not seem like a systemic way to address this issue in triage. We don't need to get full minutes of planning calls, but we are stakeholders and that should merit getting more info. Doubly so given its age - this issue has been open for 8.5 years now.

@Tbohunek

Tbohunek commented Jan 2, 2024

You convinced me @MohnJadden, now I want it again! 😈
When I run into -exclude scenarios, I either modify manually or remove from state and reimport after.
But I'm tilted because for me there is more evil than good in -exclude.
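
For the destroy case, the "remove from state and reimport" workaround might look roughly like this. It is a sketch under assumptions, not a recommended procedure: `destroy_except` is a hypothetical wrapper, the ID passed to `terraform import` is provider-specific and has to be supplied by the user, and the `TERRAFORM` variable exists only so the command sequence can be previewed by setting it to `echo`.

```shell
#!/bin/sh
# Hypothetical wrapper (not a Terraform feature): keep one resource alive
# across a destroy by untracking it first and re-importing it afterwards.
# $1 = resource address, $2 = provider-specific ID used by `terraform import`.
destroy_except() {
  addr=$1
  id=$2
  tf=${TERRAFORM:-terraform}   # set TERRAFORM=echo to preview the commands

  # 1. Stop tracking the resource; the real object is left untouched.
  "$tf" state rm "$addr" || return 1
  # 2. Destroy whatever is still tracked in the state.
  "$tf" destroy || return 1
  # 3. Bring the surviving resource back under management.
  "$tf" import "$addr" "$id"
}
```

Running `TERRAFORM=echo destroy_except aws_db_instance.foo my-rds-id` prints the three steps without touching anything. Note that the destroy step can still fail at the provider level for resources the kept object relies on (for an RDS instance, its VPC and subnets).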

@tburow

tburow commented Jan 2, 2024

A lot of the history has been lost on this request. I'll start by saying I was an early requester of this feature, even talking through support channels to the H. engineers directly. Bottom line: it's not an easy thing to solve.

Level setting the technical fallout: one would have to reverse process the resource dependency tree interaction for each provider. AWS is the most popular, but it is also only one of many providers. This feature request would live in the base code, not in the provider itself.

Using the AWS provider as an example: we all know the dependency tree in that provider has gaps and is not always in sync with what's true in the platform, especially on destroy. With gaps (in the providers, and possibly in state-file scenarios), the dependencies on both creation and destroy have to be known with absolute certainty for such an operation to work. Now multiply this by every provider, and things get really complicated really quickly and can very easily nuke a state file beyond any sane recovery.

It is possible to implement, but it would have a cascade effect through the code and all providers, requiring dependency processing changes with strict validation. At some point it was decided that the feature is not worth the effort, or better put: "The juice just isn't worth the squeeze."

Noting again: it's a feature I would like to see. I think timing and overall supporting feature maturity within the codebase matter here. Perhaps with future improvements this will be more viable.

Click-ops is the no. 1 cardinal sin for IAC managed systems btw (no judgement, but we all know this)

@dimisjim

dimisjim commented Jan 2, 2024

One would have to reverse process the resource dependency tree interaction for each provider

I am not privy to all discussions in this thread nor how resource targeting works step by step in the background, but I don't understand what this comment exactly means, or why it would be necessary to perform such actions with providers to achieve inverse targeting.

I would assume that the way -target=.. works is that it reads the code, finds a match and if it does, plans only for that resource (or multiple if many -target flags are defined) and returns the diff based on what it has also on the state file. Why wouldn't it be easy to implement the same (have multiple -target flags, except for the resources specified in a theoretical inverse targeting flag) by scanning the code in a similar way?

@Tbohunek

Tbohunek commented Jan 2, 2024

@dimisjim excluding resources before plan is I think the difficulty @tburow talks about. It changes how the tree is constructed, especially if exclusions happen at multiple different scopes, and may lead to deadlocks.

I think full tree and full plan must be constructed first as usual, and then from this plan simply remove excluded scopes and all that depends on it. This to me sounds pretty simple, it's just a filtering operation on top of unchanged plan generation.
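
The filtering step described here can be sketched as a small fixed-point computation. This is only an illustration of the idea, not how Terraform builds or prunes its graph: `excluded_closure` is a hypothetical helper, and the input format (one `RESOURCE DEPENDS_ON` pair per line) is an assumption made for the example. It prints the excluded addresses plus everything that transitively depends on them, which is the set that would be dropped from the full plan.

```shell
#!/bin/sh
# Hypothetical sketch (not Terraform's implementation): read dependency edges
# on stdin as "RESOURCE DEPENDS_ON" pairs and print, sorted, the addresses in
# "$@" plus every resource that transitively depends on one of them.
excluded_closure() {
  awk -v seeds="$*" '
    BEGIN { n = split(seeds, s, " "); for (i = 1; i <= n; i++) drop[s[i]] = 1 }
    { from[NR] = $1; to[NR] = $2 }   # remember each edge
    END {
      # Repeat until a full pass adds nothing new (fixed point).
      do {
        changed = 0
        for (i = 1; i <= NR; i++)
          if ((to[i] in drop) && !(from[i] in drop)) { drop[from[i]] = 1; changed = 1 }
      } while (changed)
      for (r in drop) print r
    }' | sort
}
```

Everything not printed by the helper would stay in the plan. As noted later in the thread, this approach still needs a full plan first, so it does not help when a resource cannot be read at all.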

@weakcamel

weakcamel commented Jan 3, 2024

I think full tree and full plan must be constructed first as usual, and then from this plan simply remove excluded scopes

There's one serious drawback to it: implementing it this way, you can't always skip resources which are unavailable due to HW failure (which I personally consider the most legit use for targeting).

An example I gave earlier in this thread: an -exclude flag would be particularly useful when managing the VMs of a vSphere cluster where one or more hosts are temporarily disconnected due to a HW issue. While a host is disconnected, you can't pull any data from it to include in your plan; it's just inaccessible until fixed and brought back. The -target flag is currently able to work around that.

@Tbohunek

Tbohunek commented Jan 3, 2024

...can't always skip resources which are unavailable due to HW failure (which I personally consider the most legit use for targeting).

Fair point. It may be most legit, but is it most common? Sounds like very edge case + complex to implement + you have a viable workaround.
The above full plan implementation would help many other more common use-cases + simple to implement (I think), which makes it more lucrative to implement. It would be better than nothing, and could be improved later.

@pdfrod

pdfrod commented Jan 3, 2024

My use case is similar to @weakcamel's: I have a SQL database on Google Cloud that is turned off most of the time to reduce costs, and while it's turned off, we can't pull data from it to include in the plan. I can manually turn on the database, but it's annoying because it takes minutes before the database comes back up, and then I have to remember to shut it down after the Terraform work is done. It wouldn't be very practical either to create a separate Terraform project just for this single resource. This is the main reason why I'd like to have that "exclude" flag.

I don't see that the implementation of "exclude" needs to be that complicated. For me it would be enough if it was implemented on top of "target", where the list of targets are all the resources except the ones explicitly mentioned in the "exclude" flag.

Anyway, I'm not sure if there's any point in discussing this any further as it seems clear that this is not on Hashicorp's roadmap.

@dimisjim

dimisjim commented Jan 3, 2024

I don't see that the implementation of "exclude" needs to be that complicated. For me it would be enough if it was implemented on top of "target", where the list of targets are all the resources except the ones explicitly mentioned in the "exclude" flag.

yeah, exactly. Seems to me that this can be inferred somehow from the code and the state file. Not sure why a complicated "each provider inverse dependence tree" calculation is needed as others have vaguely mentioned.

@weakcamel

Fair point. It may be most legit, but is it most common? Sounds like very edge case + complex to implement + you have a viable workaround.

I wouldn't say it's most common yet I wouldn't say "edge case" either. It depends on your scale; hardware failures happen fairly often if you're running a lot of hosts. Connectivity problems happen too if your network is large enough or you're routing traffic through a network you don't fully control.

The most common use case for targeting is arguably drift between actual deployment vs TF code due to manual interventions; if you do that (which is sometimes a legit need), you're causing the problem yourself and there's an obvious way to prevent that from happening (just don't). In case of a HW failure or third party connectivity problems - not so much.

Also, implementing negative targeting differently than the (existing) positive targeting would be quite confusing to end users to say the least.

@MohnJadden

I'm going to go out on a limb here and say that I honestly don't care why this is difficult to implement. That sounds callous as hell, but think of it like this - I'm an end user of Terraform, not a dev. I don't code, I don't intend to code, but I understand that this issue is an ask of the devs to undertake a task. All I (or others like me) can do is state that this would be useful, and the fact that OpenTofu was able to implement it leads me to believe that Hashicorp could implement it if they wanted to.

TF is a tool, in my opinion. This ask is sort of like asking for a different iteration of a tool: "As a socket wrench user, I need to unscrew bolts that are six or more inches inside an engine compartment. I want to be able to attach an extension of some kind that allows my socket wrench to reach inside the engine and unscrew these bolts." "As a hammer user, I need to hit surfaces with blunt force but not leave a mark. I want to be able to use a rubber mallet to hit these surfaces." Et cetera.

It'd be unreasonable to ask for a monkey wrench to have a hammer end, but IMO it'd be reasonable to ask for a monkey wrench to come in a larger size for bigger nuts or longer handle for harder to reach spots. This is what we're asking here - a tool with extended functionality.

@Tbohunek

Tbohunek commented Jan 3, 2024

Most helpful would be to get the review result from Hashicorp / @crw.
Working on | Planned Q3 | Need more upvotes | Won't do because
Then we can talk how to push it further.

@crw
Collaborator

crw commented Jan 3, 2024

@Tbohunek Release planning happens generally around the time (just before / just after) we are preparing the latest "minor" release (the next minor release would be 1.7.0). Given that this is the top issue by upvotes, it is being routinely re-evaluated.

@yermulnik

This comment was marked as off-topic.

@Tbohunek

Tbohunek commented Jan 4, 2024

@Tbohunek Release planning happens generally around the time (just before / just after) we are preparing the latest "minor" release (the next minor release would be 1.7.0). Given that this is the top issue by upvotes, it is being routinely re-evaluated.

...and the result? Of the most recent re-evaluation? Pretty please?

@bgshacklett

But I'm tilted because for me there is more evil than good in -exclude.

I would suggest that the evil is not in -exclude, but whatever workflow failure caused it to be needed. -exclude would, ideally, only be used as a tool to remediate said evil and get back on the right course. One could argue that -target can do the same thing, but in any sufficiently large code base, the ability to execute with targeting only can be prohibitively complex.

@Poltergeisen

Poltergeisen commented Feb 8, 2024

I think this feature might be a good thing as well. I am using Azure and for Storage accounts I had a problem where private endpoints weren't working. I needed to re-deploy a module that creates a storage account and the private links for that storage account. The problem is that if there aren't any private endpoints, then the queries fail when trying to apply the terraform for the storage account.

If there were an "exclude" option, I'd just exclude the storage account from the apply and redeploy all the private links.

@ecoupal-believe

This comment was marked as off-topic.

@sebastian-blaszczak

This comment was marked as duplicate.

@weakcamel

weakcamel commented Apr 19, 2024

Great feature indeed. Yesterday we could not apply because the cloudflare api was down, have fun with -target option...

This comment has been marked as off-topic; however, IMO it's quite valid (despite the casual and somewhat sarcastic tone).

This is one of the valid cases for implementing -exclude-target, similar to one of my previous vSphere examples (i.e. a physical outage in one of the X cluster ESXi hosts makes it impossible to apply changes to other, perfectly sane hosts).

@apparentlymart
Contributor

The Terraform team is currently working on a concept called "deferred actions" that does something conceptually similar to -target -- plans actions for only a subset of the declared resources -- but takes a very different approach in the details.

The current -target option is problematic because it's implemented at a strange level of abstraction: it literally just deletes nodes blindly from Terraform's execution graph, with no real knowledge of the meaning of what it's doing, and so it can potentially cause Terraform to accept something that doesn't really make any sense and couldn't have occurred without -target. That means that teams often get "trapped" using -target for everything because they have accidentally created a situation that Terraform can't understand when it tries to produce a full plan.

Deferred actions takes a different approach: it still visits all of the resources declared in the configuration, but for any that are deferred it only performs validation and not planning. Terraform can still detect and report certain kinds of problems with the objects that have been deferred, and can still use approximations of their results to validate downstream objects that refer to other deferred resources.

The first goal for deferred actions is to close #30937 by making Terraform automatically defer anything that it knows it cannot plan yet due to unknown values being present. That would not directly affect this issue, but would remove one significant current use-case for -target by making Terraform "do the right thing" automatically.

But once that is dealt with it seems to me that a new explicit -defer=ADDR planning option would be the best way to close this issue, forcing Terraform to defer something even though Terraform doesn't know itself why it is being deferred. That would both meet the use-cases that motivated this issue and do so in a better way that wouldn't need so many cautions and caveats, and would give more confidence about whether a subsequent plan without deferrals is likely to succeed.

This is a significant new concept that needs plumbing throughout Terraform Core and so it'll take a while longer to be ready, but folks on the team are actively working on it.

@adamyodinsky

+1

@heldersepu

heldersepu commented Jul 2, 2024

it seems to me that a new explicit -defer=ADDR planning option would be the best way to close this issue

This is by a large margin the most upvoted issue on this repo...
@apparentlymart any update on this?

If this is still being developed, maybe look at #2182; it would be nice to be able to do -defer=aws*.staging*

@arpitgup

Hello @phinze

I've encountered an issue related to the "prevent_destroy" feature and was unable to find a solution after reviewing various cases. My use case is fairly straightforward:

I have a Cloud Build pipeline with the following steps:

  1. Mutate organization policies in the GCP project (using prevent_destroy).
    Executing terraform apply.
  2. Wait for policies to propagate.
  3. Deploy Terraform resources (e.g., VMs, BigQuery, etc.).
  4. Revoke the mutated organization policies in the GCP project (from the same state file).
    Running terraform destroy with the prevent_destroy block enabled for a few policies. I want to ensure that policies where the flag is set to true are not destroyed.

Ideally, I’m looking for a way to handle failures gracefully. Specifically, if an error occurs, I want Terraform to proceed by destroying resources with the flag set to false before exiting. Alternatively, if exiting with an error is unavoidable, I want it to first destroy those non-protected resources before terminating.

Any advice or guidance on how to achieve this would be greatly appreciated.

Thank you!
