Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

output-all <var_name> gives up too easily #193

Closed
mrtyler opened this issue Apr 26, 2017 · 10 comments
Closed

output-all <var_name> gives up too easily #193

mrtyler opened this issue Apr 26, 2017 · 10 comments
Labels
enhancement New feature or request

Comments

@mrtyler
Copy link
Contributor

mrtyler commented Apr 26, 2017

Context

I'm trying to use Terragrunt-style modules with kitchen-terraform for testing. Since kitchen-terraform is designed for the vanilla terraform workflow (whereas I need terragrunt's apply-all to invoke modules together) and since kitchen-terraform doesn't provide a lot of flexibility in configuring the terraform commands it issues, I'm working on a (currently very rough) wrapper script to translate from vanilla terraform to terragrunt *-all.

kitchen-terraform uses terraform output and terraform output <var_name> to collect values from the provisioned system (e.g. the worker module provides a public_ip variable which is used by integration tests that want to connect to the worker instance). I'm not sure why it needs two passes to do this; I'll be looking into that with the kitchen-terraform folks next :).

Demonstrating the problem

Minimal terraform code is here but it's just this:

### ./base/main.tf
output "base_var" { value = "i am base_var" }

### ./base/terraform.tfvars
terragrunt {}

### ./worker/main.tf
output "worker_var" { value = "i am worker_var" }

### ./worker/terraform.tfvars
terragrunt {
  dependencies {
    paths = ["../base"]
  }
}

Here's terragrunt output-all worker_var:

(Note that terragrunt master contains a bug such that output-all visits its dependencies backwards. I'm using my fix from #194 to hopefully make the example easier to follow. If you are trying to reproduce the problem with master, the behavior for base_var and worker_var will be reversed (or you can build terragrunt including my fix.))

$ go run ~/code/go/src/github.com/gruntwork-io/terragrunt/main.go output-all base_var
[terragrunt] [/Users/tyler/rtf/tg-all] 2017/04/25 23:35:36 Running command: terraform --version
[terragrunt] 2017/04/25 23:35:36 Stack at /Users/tyler/rtf/tg-all:
  => Module /Users/tyler/rtf/tg-all/base (dependencies: [])
  => Module /Users/tyler/rtf/tg-all/worker (dependencies: [/Users/tyler/rtf/tg-all/base])
[terragrunt] [/Users/tyler/rtf/tg-all/worker] 2017/04/25 23:35:36 Module /Users/tyler/rtf/tg-all/worker must wait for 1 dependencies to finish
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:36 Module /Users/tyler/rtf/tg-all/base must wait for 0 dependencies to finish
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:36 Running module /Users/tyler/rtf/tg-all/base now
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:36 Reading Terragrunt config file at /Users/tyler/rtf/tg-all/base/terraform.tfvars
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:36 Running command: terraform output base_var
i am base_var
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:36 Module /Users/tyler/rtf/tg-all/base has finished successfully!
[terragrunt] [/Users/tyler/rtf/tg-all/worker] 2017/04/25 23:35:36 Dependency /Users/tyler/rtf/tg-all/base of module /Users/tyler/rtf/tg-all/worker just finished succesfully. Module /Users/tyler/rtf/tg-all/worker must wait on 0 more dependencies.
[terragrunt] [/Users/tyler/rtf/tg-all/worker] 2017/04/25 23:35:36 Running module /Users/tyler/rtf/tg-all/worker now
[terragrunt] [/Users/tyler/rtf/tg-all/worker] 2017/04/25 23:35:36 Reading Terragrunt config file at /Users/tyler/rtf/tg-all/worker/terraform.tfvars
[terragrunt] [/Users/tyler/rtf/tg-all/worker] 2017/04/25 23:35:36 Running command: terraform output base_var
The output variable requested could not be found in the state
file. If you recently added this to your configuration, be
sure to run `terraform apply`, since the state won't be updated
with new output variables until that command is run.
[terragrunt] [/Users/tyler/rtf/tg-all/worker] 2017/04/25 23:35:37 Module /Users/tyler/rtf/tg-all/worker has finished with an error: exit status 1
[terragrunt] 2017/04/25 23:35:37 Encountered the following errors:
exit status 1
[terragrunt] 2017/04/25 23:35:37 Unable to determine underlying exit code, so Terragrunt will exit with error code 1
exit status 1

The good news is that the variable we asked for is in the output (i am base_var). The bad news is this looks and feels like a failure due to the warning message and non-zero exit code.

Worse, it is now impossible to get variables from the worker module since it depends on base, which fails. Note that the expected output i am worker_var does not appear:

$ go run ~/code/go/src/github.com/gruntwork-io/terragrunt/main.go output-all worker_var
[terragrunt] [/Users/tyler/rtf/tg-all] 2017/04/25 23:35:44 Running command: terraform --version
[terragrunt] 2017/04/25 23:35:44 Stack at /Users/tyler/rtf/tg-all:
  => Module /Users/tyler/rtf/tg-all/base (dependencies: [])
  => Module /Users/tyler/rtf/tg-all/worker (dependencies: [/Users/tyler/rtf/tg-all/base])
[terragrunt] [/Users/tyler/rtf/tg-all/worker] 2017/04/25 23:35:44 Module /Users/tyler/rtf/tg-all/worker must wait for 1 dependencies to finish
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:44 Module /Users/tyler/rtf/tg-all/base must wait for 0 dependencies to finish
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:44 Running module /Users/tyler/rtf/tg-all/base now
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:44 Reading Terragrunt config file at /Users/tyler/rtf/tg-all/base/terraform.tfvars
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:44 Running command: terraform output worker_var
The output variable requested could not be found in the state
file. If you recently added this to your configuration, be
sure to run `terraform apply`, since the state won't be updated
with new output variables until that command is run.
[terragrunt] [/Users/tyler/rtf/tg-all/base] 2017/04/25 23:35:44 Module /Users/tyler/rtf/tg-all/base has finished with an error: exit status 1
[terragrunt] [/Users/tyler/rtf/tg-all/worker] 2017/04/25 23:35:44 Dependency /Users/tyler/rtf/tg-all/base of module /Users/tyler/rtf/tg-all/worker just finished with an error. Module /Users/tyler/rtf/tg-all/worker will have to return an error too.
[terragrunt] [/Users/tyler/rtf/tg-all/worker] 2017/04/25 23:35:44 Module /Users/tyler/rtf/tg-all/worker has finished with an error: Cannot process module Module /Users/tyler/rtf/tg-all/worker (dependencies: [/Users/tyler/rtf/tg-all/base]) because one of its dependencies, Module /Users/tyler/rtf/tg-all/base (dependencies: []), finished with an error: exit status 1
[terragrunt] 2017/04/25 23:35:44 Encountered the following errors:
exit status 1
Cannot process module Module /Users/tyler/rtf/tg-all/worker (dependencies: [/Users/tyler/rtf/tg-all/base]) because one of its dependencies, Module /Users/tyler/rtf/tg-all/base (dependencies: []), finished with an error: exit status 1
[terragrunt] 2017/04/25 23:35:44 Unable to determine underlying exit code, so Terragrunt will exit with error code 1
exit status 1

Pragmatically, this is preventing me from getting kitchen-terraform and terragrunt to play together. Philosophically, I think it is incorrect for terraform output-all some_variable to exit non-zero just because some_variable isn't defined in every single module, but I am still new to Terraform so please let me know if I'm missing something here.

I'm also new to Go and the terragrunt code base, but my first guess about how to solve this problem would be a new flag passed among configstack/running_module::waitForDependencies and friends that ignores errors in dependencies. Perhaps there's a cleaner way?

@brikis98
Copy link
Member

Just to check that I'm understanding the issue:

  1. You have two modules, foo and bar. Module foo has an output called XXX, but bar does not.
  2. Your run terragrunt output-all XXX.
  3. Terragrunt exits with an error because output XXX does not exist in bar.

To be honest, this seems like the correct behavior from Terragrunt, simply because that's the exact behavior from Terraform. If the output you requested is missing, exit with an error.

That said, it may make sense to add a new flag, such as terragrunt-ignore-dependency-errors, which tells the xxx-all commands in Terragrunt to run, even if a dependency exits with an error. I'm not sure how common of a use case that is, though.

@mrtyler
Copy link
Contributor Author

mrtyler commented Apr 26, 2017

Hi Yevgeniy! Great to talk to you after reading your blog posts and stalking you in r/devops :p.

Your summary is correct.

To be honest, this seems like the correct behavior from Terragrunt, simply because that's the exact behavior from Terraform. If the output you requested is missing, exit with an error.

In general I agree: it doesn't make sense to apply worker if apply base failed. However, I think output-all is a special case. Let me try to make my example more concrete.

I have two terraform components, base and worker. base sets up shared infrastructure like networking and routing, so its output contains variables like subnet_id. worker sets up an instance, so its output contains variables like public_ip. The instance needs all that networking stuff to be provisioned before it can do anything, so worker depends on base. I believe this is a very common Terraform pattern.

I want to fetch public_ip so I can pass it off to another tool. I run terragrunt output-all public_ip. What do we expect to happen?

Now I want to fetch subnet_id so I can pass it off to another tool. I run terragrunt output-all subnet_id. What do we expect to happen now?

My expectation in both cases is that I get the variable I want with a successful return code (the presence of warning text about undefined variables is somewhat confusing in this case but not a huge deal compared to exiting non-zero thereby signaling failure to the caller of terragrunt). I don't care that public_ip isn't defined in every component -- why would it be? Please give me the single value of public_ip that I asked for.

Does this differ from your expectation? Do you expect the two scenarios to behave differently? Because they do :) -- the former returns warnings + nonzero exit code without the requested value, the latter returns warnings + nonzero exit code with the requested value. At the very least this seems inconsistent to me.

Does this better explain why I think this is a problem? Am I missing something?

I'm not sure how common of a use case that is, though.

For apply-all, I expect it would be quite rare.

For destroy-all, I can imagine a scenario where I just want to destroy everything as fast and as hard as possible, plan to clean up any loose ends afterwards via the AWS console, and don't care about leaving the backend instance up just because the frontend instance hasn't died yet. Probably not very common but plausible, perhaps.

For output-all, I think ignoring dependencies should be the default because the alternative is both confusing and preventing me from using kitchen-terraform :). That said, based on my experience it appears I am the only human in the galaxy that is trying to use terragrunt-style modules with kitchen-terraform integration tests. I guess that's a pretty uncommon use case! But it is blocking me and may force me to abandon terragrunt in favor of vanilla terraform + templated terraform files.

@brikis98
Copy link
Member

I run terragrunt output-all public_ip. What do we expect to happen?

Out of curiosity, why run terragrunt output-all XXX instead of going into the subfolder you care about and running terragrunt output XXX? If you don't expect all the outputs to contain the output XXX, doesn't it make sense to just ask for the one output you want?

@mrtyler
Copy link
Contributor Author

mrtyler commented Apr 26, 2017

Pragmatic answer: because that's the way kitchen-terraform works. Investigating that is next on my todo list.

Design answer: I like to think of the *-all commands as an abstraction that lets me treat a collection of terraform components as a single terraform project. This gives me state isolation for components without having to write my own for loop and glue things together. cd foo && terragrunt output XXX breaks this abstraction.

Put another way: "why run terragrunt apply-all instead of going into the subfolder you care about and running terragrunt apply?" :P

A concrete scenario: I have a script that does cd foo && terragrunt output XXX. Someone refactors and moves the definition of XXX from foo to bar. My script has to know about that implementation detail. Why? I'd rather write terragrunt output-all XXX and let the system figure out where XXX happens to be defined.

Is my mental model for *-all incorrect, perhaps?

@brikis98
Copy link
Member

brikis98 commented Apr 27, 2017

I like to think of the *-all commands as an abstraction that lets me treat a collection of terraform components as a single terraform project. This gives me state isolation for components without having to write my own for loop and glue things together. cd foo && terragrunt output XXX breaks this abstraction.

This is a compelling reason, but it seems like it's equally well supported by a --terragrunt-ignore-dependency-errors style flag. Is there any way to get kitchen-terraform to pass such a flag?

why run terragrunt apply-all instead of going into the subfolder you care about and running terragrunt apply

I don't think that analogy works, as it's obvious why the xxx-all commands exist, and all of them exit with an error if any dependencies fail.

A concrete scenario: I have a script that does cd foo && terragrunt output XXX. Someone refactors and moves the definition of XXX from foo to bar. My script has to know about that implementation detail. Why? I'd rather write terragrunt output-all XXX and let the system figure out where XXX happens to be defined.

The fact that foo moved to bar is not an implementation detail, but a change in your public API. I don't think blindly papering over it is a good idea in all cases. In fact, it exposes you to a different error condition, where you accidentally add a second output named XXX in some other module (e.g. think of common output names like public_ip or id), and you pick up that output instead of the one you actually wanted. So in general, I'd actually strong recommend using cd foo && terragrunt output XXX over terragrunt output-all XXX.

Is my mental model for *-all incorrect, perhaps?

No, not at all. I agree that using the Terragrunt xxx-all commands to treat a set of modules as a single module is compelling. I think the only question is whether the default output-all behavior should be consistent with all the other xxx-all commands (that is, exit when a dependency hits an error) and perhaps modified via a flag, or if we should change the default for this one command. The latter is less appealing to me, as it would be an exception to the way the rest of the code works, so I'd prefer to enable the behavior you want with a flag. If kitchen-terraform can't handle that, it seems like an issue that should be fixed in kitchen-terraform.

@mrtyler
Copy link
Contributor Author

mrtyler commented Apr 27, 2017

I won't reply to all of your points here. I disagree with a few (which I'm happy to discuss if you're interested) but it seems we have consensus on the usefulness of --terragrunt-ignore-dependency-errors. That's what I was after, so I'll stop trying to persuade you :).

I'm fine with leaving the default behavior of output-all alone and adding --terragrunt-ignore-dependency-errors to the command line myself as needed. If changing output-all's default behavior turns out to be the right thing to do, I'm sure someone will come along and create an Issue requesting it ;.)

My next step is still to see if I can resolve this output-all XXX problem from the kitchen-terraform side. Depending on how that goes, I'll then look at implementing --terragrunt-ignore-dependency-errors.

Thanks!

@brikis98 brikis98 added enhancement New feature or request help wanted labels Apr 27, 2017
@brikis98
Copy link
Member

Sounds great, thanks!

@mrtyler
Copy link
Contributor Author

mrtyler commented May 2, 2017

After studying the kitchen-terraform code, I don't think it's feasible to fix this problem from that side. I got my hopes up after finding newcontext-oss/kitchen-terraform#113. While that PR does make the situation better, my take (as an admittedly poor Rubyist) is that the Kitchen architecture isn't going to let me look up all the Terraform output variables once and then share that info with everything that needs it. (Put another way, there's no way to prevent kitchen from needing to do output XXX at some point.)

Aaron's comment indicates some re-design in their upcoming 1.0.0 release so maybe the situation will be different later.

In the meantime, I'm going to look at adding this flag.

@mrtyler
Copy link
Contributor Author

mrtyler commented May 4, 2017

Filed #209. Using --terragrunt-ignore-dependency-errors I was able to use kitchen-terraform to test my component stack!

Bonus note: even though I'm a Go noob, I found it straightforward to add this feature thanks to terragrunt's sensible organization and excellent collection of tests. Good job, team! 🎉

@brikis98
Copy link
Member

brikis98 commented May 4, 2017

Fixed with #209

@brikis98 brikis98 closed this as completed May 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants