
Support for terraform workspaces? #1581

Open

marco-m-pix4d opened this issue Mar 11, 2021 · 21 comments
Labels
needs design We need to flesh out the design before we can resolve the issue

Comments

@marco-m-pix4d
Contributor

Hello,

I am evaluating terragrunt, since we have two terraform root modules with a dependency between the two, and we feel the need for something less error-prone than performing terraform operations by hand in each directory.

We rely heavily on terraform workspaces, and to my understanding terragrunt does not support them. I tend to agree that workspaces are probably not a good idea to start with, but I am trying to understand if there is a sort of migration path for us, or if, to use terragrunt, we have to revolutionize our workflows and terraform configurations.

So I have two questions:

  1. Is there any way / workaround to use terragrunt with terraform workspaces?
  2. If not, would you be willing to consider a PR that adds minimal workspace support? I see it as a command-line option, terragrunt --workspace=foo, that would select workspace foo in each root module before performing further operations.

thanks!

@yorinasub17
Contributor

terragrunt does not support them

While terragrunt doesn't have native support for workspaces, there is nothing barring you from using them with terragrunt. E.g., you can run terragrunt workspace select dev to switch workspaces beforehand.
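
For example (a sketch; as noted below, you would also want to disable auto-init, via the TERRAGRUNT_AUTO_INIT environment variable or --terragrunt-no-auto-init):

TERRAGRUNT_AUTO_INIT=false terragrunt init
terragrunt workspace select dev
TERRAGRUNT_AUTO_INIT=false terragrunt plan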

You will need to disable auto-init to make this work, though, as you will lose the workspace settings any time the terragrunt cache folder is lost. Similarly, --terragrunt-source-update needs to be used with caution. Which brings me to...

Is there any way / workarounds to use terragrunt with terraform workspaces?

You can configure a before/after hook to handle this, in conjunction with get_env. The idea would be to define some environment variable (e.g. TG_WORKSPACE) that denotes what workspace terragrunt should use, and have it always switch workspaces prior to calling terraform. E.g.:

terraform {
  before_hook "workspace" {
    commands = ["plan", "state", "apply", "destroy", "init"]
    execute = ["terraform", "workspace", "select", get_env("TG_WORKSPACE")]
  }
}

This should work around the limitations pointed out above, where the workspace settings get wiped out, since terragrunt actively switches workspaces prior to running terraform.

If not, would you be willing to consider a PR that adds minimal workspace support?

We would be open to a PR that adds better support for workspaces, but I think it needs a bit more design thought. E.g., for the proposal you have, a few questions pop to mind:

  • Is the --workspace flag any different from running terragrunt workspace select dev && terragrunt plan?
  • What happens if one runs terragrunt --workspace dev plan and then terragrunt plan? Should it use the dev workspace or default?
  • When does the workspace selection happen? Prior to every call to terraform? Only for the command that is going to run? Or should it also happen for auto-init?

Given that, it would be useful to see an RFC that walks through the workflow with a few examples to understand how this would work before seeing some code.

@marco-m-pix4d
Contributor Author

Hello @yorinasub17, I wish I always got answers of this quality when I ask a question :-)
You not only give a workaround, you also explain the inner workings and the limitations.

Regarding "adding better support for workspaces", yes, I agree, going through an RFC makes sense, talking before coding is something I appreciate for sure.

Thanks again. I need to think and experiment; I will come back if we decide to go with terragrunt.

@VictorMCL

VictorMCL commented Jul 8, 2021

(quoting @yorinasub17's reply above in full)

Hello @yorinasub17, I tried this way and it worked.

before_hook "set_workspace" {
  commands = ["plan", "state", "apply", "destroy"]
  execute = ["terraform", "workspace", "select", run_cmd("terraform", "workspace", "show")]
}

But there is a problem: it is possible for a dependency block to be on a different workspace than the block that calls it, which can trigger a serious error.

A -> B
A = Dev
B = QA

For now, you should create a script to synchronize the workspaces when one block has a dependency on another.

@yorinasub17
Contributor

I believe you can get this to do the right thing if you add output to the commands list in the hook. The dependency-fetching routine runs the before hooks registered for output when it extracts the outputs.

Note that in this model, you won't be able to use the dependency optimization, so you will need to make sure to set disable_dependency_optimization = true on the remote_state block.
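
For example, a sketch of a remote_state block with the optimization disabled (the backend and config values are illustrative):

remote_state {
  backend                         = "s3"
  disable_dependency_optimization = true
  config = {
    bucket = "my-state-bucket"  # illustrative
    key    = "${path_relative_to_include()}/terraform.tfstate"
    region = "us-east-1"
  }
}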

@infraredgirl added the needs design label and removed the needs-design label on Oct 20, 2021
@pkleanthous

Hi there,

I'm trying to use the following suggestion:

terraform {
  before_hook "workspace" {
    commands = ["plan", "state", "apply", "destroy", "init"]
    execute = ["terraform", "workspace", "select", get_env("TG_WORKSPACE")]
  }
}

But if the workspace does not exist, the command fails silently and you end up on the wrong workspace.
Is there a solution for that?
Something like: select a workspace, and if it does not exist, create it.

@giladdi-tr

@pkleanthous, did you find any solution?

@VictorMCL

Hi everyone,

in my case, I created a script that initializes and manages a common file storing the workspace for all modules, mirroring the way terraform itself operates. Whenever a Terragrunt command is executed, the script checks which workspace the common file records, and then executes the Terragrunt command (plan, apply, or any other). I hope it helps you, @pkleanthous and @giladdi-tr
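
A minimal sketch of such a wrapper, assuming a hypothetical .workspace file at the repository root and the TG_WORKSPACE-based hook shown earlier:

#!/bin/bash
# tg.sh (hypothetical name): run terragrunt with the workspace recorded
# in a common file, so every module stays on the same workspace.
set -euo pipefail

WORKSPACE_FILE="$(git rev-parse --show-toplevel)/.workspace"  # hypothetical location
export TG_WORKSPACE="$(cat "$WORKSPACE_FILE")"

# e.g. ./tg.sh plan, ./tg.sh apply
terragrunt "$@"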

@giladdi-tr

giladdi-tr commented Dec 16, 2021

I created the following solution; it is not the prettiest, but it works:

terraform {
  before_hook "select workspace" {
    commands = ["plan", "state", "apply", "destroy", "init", "refresh"]
    execute = ["./workspace-hook.sh", get_env("TG_WORKSPACE")]
  }
}

./workspace-hook.sh

#!/bin/bash
terraform workspace select "$1" 2>/dev/null || terraform workspace new "$1"

@pkleanthous @VictorMCL

@yorinasub17, do you think there is a better solution?

@VictorMCL

This is a good start. Keep in mind that when you have two or more modules (for example, a VPC and an EC2 instance), you must synchronize both to the same workspace; otherwise this can be dangerous. For this reason, I suggest creating a file in the root of the project that saves the state of the workspace.

You can use the same script to save the workspace state every time you run the workspace select command with terragrunt, and use that state when you run commands like plan, apply, or destroy.

@giladdi-tr

@giladdi-tr

I didn't fully understand what you mean. Would you mind sharing an example? @VictorMCL

@GerardoGR

Thanks for sharing your snippet @giladdi-tr, it helped me get started with workspaces in Terragrunt. After some more fiddling around, I found out that Terraform actually supports a TF_WORKSPACE environment variable that selects (or creates, if it doesn't exist) the terraform workspace. The only caveat is that when executing init with a TF_WORKSPACE that doesn't exist yet, terraform will prompt the following:

The currently selected workspace ($WORKSPACE_NAME) does not exist.
This is expected behavior when the selected workspace did not have an
existing non-empty state. Please enter a number to select a workspace:

  1. default
    # ...Other workspaces here...
    Enter a value:

To work around that, I disabled terragrunt's auto-init and manually executed terragrunt init with the TF_WORKSPACE variable set to nothing; for example, a "complete" execution looks like TF_WORKSPACE= terragrunt init && terragrunt apply. Also important to note: I tried setting TF_WORKSPACE to nothing via a before_hook only for the init command in terragrunt, which sounded like the natural place for such a workaround, but based on the terragrunt hooks proposal issue, it seems that hooks do not alter the environment of the terraform command.
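
Put together, the workflow looks roughly like this (a sketch; the workspace name is illustrative):

export TERRAGRUNT_AUTO_INIT=false  # disable auto-init
TF_WORKSPACE= terragrunt init      # init without a selected workspace
TF_WORKSPACE=dev terragrunt apply  # run against the dev workspace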

And regarding the original issue discussed here, I wonder if terraform's TF_WORKSPACE environment variable already covers the functionality requested (CC: @yorinasub17 and @marco-m-pix4d).

@marco-m-pix4d
Contributor Author

@GerardoGR what we settled on was scripting the sequence

  • terraform workspace select <foo>
  • terraform <command>

for all the terraform commands one normally uses, driven by Task (like make, but with a saner syntax).

For example:

$ task apply -- foo

runs terraform workspace select foo && terraform apply for all the terraform root environments (root modules) we have.
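
A minimal Taskfile sketch of this pattern (the task and file names are illustrative):

# Taskfile.yml (sketch)
version: '3'

tasks:
  apply:
    cmds:
      # "task apply -- foo" makes {{.CLI_ARGS}} expand to "foo"
      - terraform workspace select {{.CLI_ARGS}}
      - terraform apply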

We did try TF_WORKSPACE before, but we were concerned that the operation was transient and somehow less visible.

@thetimbecker

I have had some success with a modified version of the script @giladdi-tr wrote. It's even less pretty, but it solved an issue I was running into where Terraform wasn't initialized yet:

terraform workspace select $1 2>/dev/null || (terraform init && terraform workspace new $1)

Then I ran into another issue in a dependent module. The module I was running was new, the one it depended on had an existing dev workspace, but Terraform wasn't initialized there yet. So now I have:

terraform workspace select $1 2>/dev/null || (terraform init && terraform workspace select $1 2>/dev/null) || terraform workspace new $1

This way Terraform is initialized if necessary, but not every time. It's a little cumbersome but has worked so far. We will see how it scales. I may add something to hide the terraform init output.

@VictorMCL I am also not sure what you mean. Since the workspace is controlled by an environment variable, you will always be referencing the same workspace in dependent modules.

(FWIW, I also use get_env("TG_WORKSPACE", "dev") so that by default we are always pointing at dev)

@spacex

spacex commented Apr 26, 2022

As much as scripting sequences and using hooks works, wouldn't a tighter integration work more completely? It would need to take into account several use cases, as referenced in @yorinasub17's initial response.

Proposition (sketched concretely after the list):

  • add --workspace <workspace> CLI flag
    • If specified, will run terraform workspace select <workspace> before all terraform commands, just prior to running terraform
      • If it fails to select the workspace, then the command should fail, rather than default
    • No interaction with auto-init. Will select workspaces regardless of auto-init setting
    • If not specified, then do not change workspaces at all, leaving it at whichever workspace it is already at
    • Overrides workspace variables in terraform blocks if supplied
  • add --create-workspace CLI flag
    • Alters the workspace-not-found failure mode of --workspace: instead of failing, create the workspace if it doesn't exist prior to selecting it
    • Defaults to false, changes to true when supplied
    • Overrides create-workspace variables in terraform blocks if supplied
  • Add workspace variable for terraform configuration blocks
    • Mirrors --workspace CLI flag, but localized to just the specific module
    • String value, no default value, does nothing if not specified
    • Allows for different workspaces to be used in different modules if needed
  • Add create-workspace variable for terraform configuration blocks
    • Mirrors --create-workspace CLI flag, but localized to just the specific module
    • Valid values: true/false, Default: false
  • Add workspace-override variable for terraform configuration blocks
    • Valid values: true/false, Default: false
    • Causes the workspace specified in the terraform block to override a workspace passed in via CLI
    • Allows for different modules to override a 'workspace' CLI flag passed if needed
  • (Optional) Add workspace_terraform_commands variable for terraform configuration blocks
    • Similar to mock_output_allowed_terraform_commands
    • Valid values: (list of terraform commands), Default: (all except init)
    • Only change/apply workspace for specific listed terraform commands
  • (Optional) Add environment variable used as default workspace.
    • Acts as if --workspace is specified on the command line.
    • Would be overridden by --workspace specified on the command line
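
To make this concrete, a hypothetical terragrunt.hcl and invocation under the proposed design might look like this (none of these flags or attributes exist today; the names follow the proposal above):

# terragrunt.hcl (hypothetical attributes)
terraform {
  workspace          = "dev"   # mirrors --workspace, scoped to this module
  create-workspace   = true    # mirrors --create-workspace
  workspace-override = false   # a CLI --workspace would still win here
}

# hypothetical invocation
# terragrunt plan --workspace dev --create-workspace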

Any other use cases that this wouldn't cover?

@villers

villers commented May 7, 2022

It would be cool to be able to configure the vars file according to the name of the workspace.

@jakubigla

jakubigla commented Sep 25, 2022

What we need for sure at the moment is a function like get_terraform_workspace_name(). I would love to pass its returned value to some variable of my module, which could drive the names of my cloud resources. It's a major pain point for me at the moment.

@timblaktu

timblaktu commented Oct 24, 2022

@jakubigla sounds like you're looking for terraform workspace show. And since terragrunt wraps all terraform built-in commands, terragrunt workspace show should work also.
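
If the goal is to feed the workspace name into module variables, something like the following might work (a sketch; note that run_cmd executes in the directory containing terragrunt.hcl, which may not share workspace state with the cache directory):

inputs = {
  # --terragrunt-quiet suppresses run_cmd's own logging
  workspace = run_cmd("--terragrunt-quiet", "terraform", "workspace", "show")
}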

I came here after migrating a make/docker/bash/terraform project to finally leverage terragrunt to solve the myriad problems one has with terraform-only, and I'm dismayed to discover that workspaces are a second-class citizen. I use workspaces heavily, but their use is homogeneous and follows two usage patterns:

  1. Workspaces are used mostly to isolate otherwise identical "environments" which have the same configuration/architecture. So ${terraform.workspace} is embedded into names to make inter-workspace resources unique and intra-workspace resources obviously associated.
  2. In some cases ${terraform.workspace} may be used to conditionally select variables file(s) or set a resource count.

So after reading this thread I'm wondering what the "best" solution is for me right now. It seems like it's:

  1. Disable auto-init (TERRAGRUNT_AUTO_INIT or --terragrunt-no-auto-init) and make sure to call terragrunt init as needed. I'm not too worried about this because:
    1. I haven't been using TG so I don't really know yet what I'm missing with auto-init.
    2. All the caveats at the bottom of the auto-init feature page don't build a lot of trust in me.
  2. Ensure TF_WORKSPACE is set judiciously in shells and subshells, and that terragrunt workspace new is called appropriately.

I can't fathom how a terragrunt hook that runs terraform workspace select could be superior to setting TF_WORKSPACE, which does the same thing, only more elegantly. The underlying terraform call checks this variable early and "makes it so", as if terraform workspace select had been called.

...Wow, that was fast. Actually, I already can fathom when the hook approach would be better: in the case mentioned above where different root modules in your terragrunt dependency graph need to run in different workspaces. That sounds like a nasty problem to have, and I think I might have it...

I'm using terragrunt to layer terraform root modules that have different scope: going bottom-up, per-account stuff is a layer, per-region stuff builds a layer on top, and above that there are root modules that use/depend on the lower layers. The way I'm using them, workspaces only have meaning/utility in the upper layers, where I would/could "cookie-cutter" the infra declared in those upper root modules to provide some level of isolation between test/dev/demo/prototype environments by using the "test", "dev", "demo", or "prototype" terraform workspace. But for the account-wide or regional lower layers, the resources defined in them are shared "singletons", so there's less or no need for me to use workspaces to separate them.

So, if I were to just set TF_WORKSPACE for each operation and let all the various root module layers use the same workspace, which workspace should manage the tf state of the shared singletons residing at the account or regional layer? It would seem that indeed in my case, I need to have my upper layers use "test", "dev", "demo", etc., but for the lower layers holding singleton resources, I would want these to be managed in a single workspace (probably default).

**I suppose in my case, where I actually WANT to be using different workspaces in different root modules of my terragrunt hierarchy, it would seem I have no option but to use the before_hook method of setting/customizing the workspace used in a root module, since I don't know of a way to set TF_WORKSPACE differently in each of them, and, as @GerardoGR says,

hooks do not alter the environment of the terraform command

So calling terraform workspace select/new from the before_hooks in your modules' terragrunt.hcl is apparently THE ONLY WAY to customize the workspace(s) they use.**
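
For example, per-module workspace pinning via before_hooks might look like this (a sketch; the workspace names are illustrative):

# upper-layer module's terragrunt.hcl: workspace driven by an env var
terraform {
  before_hook "workspace" {
    commands = ["plan", "apply", "destroy"]
    execute  = ["terraform", "workspace", "select", get_env("TG_WORKSPACE", "dev")]
  }
}

# lower-layer "singleton" module's terragrunt.hcl: pinned to default
terraform {
  before_hook "workspace" {
    commands = ["plan", "apply", "destroy"]
    execute  = ["terraform", "workspace", "select", "default"]
  }
}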

@pkleanthous

I have ended up using the before & after hooks:

  before_hook "workspace" {
    commands = ["plan", "state", "apply", "destroy", "refresh"]
    execute  = ["./terragrunt-hook-workspace.sh", local.workspace]
  }

  after_hook "destroy" {
    commands = ["destroy"]
    execute  = ["./terragrunt-hook-destroy.sh", local.workspace]
  }

terragrunt-hook-workspace.sh

#!/bin/bash

echo "[INFO] Switching to Terraform workspace: $1"
terraform workspace select "$1" 2>/dev/null || terraform workspace new "$1"

terragrunt-hook-destroy.sh

#!/bin/bash
# configure debug
set -xe

if [ "$1" != "default" ]; then
  echo "[INFO] Deleting Terraform workspace: $1"
  terraform workspace select default
  terraform workspace delete "$1"
fi

@j-p-c-l

j-p-c-l commented Nov 17, 2022

Hi all,
Trying this out for the first time (and going the TG_WORKSPACE way, as in my case all the terraform I need to execute would consistently use the same workspace name), I'm wondering two things; maybe I missed a detail of the documentation:

  • Is there a downside to doing an sh/bash one-liner in the execute command, instead of calling an external file? I.e.:
  before_hook "workspace" {
    commands = ["plan", "state", "apply", "destroy", "refresh"]
    execute = ["sh", "-c", "terraform workspace select $TG_WORKSPACE || terraform workspace new $TG_WORKSPACE"]
  }

I have the feeling the above would avoid the need for additional external files.
And a different question, for my understanding:

  • In all the examples in this thread, I always see a "file in the current directory" being executed (i.e. execute = ["./terragrunt-hook...sh"]). I cannot picture your local layout (I'm assuming you have a tree of resources): would this sh file exist in every directory where you have terragrunt-managed terraform code? I imagine not, because it would not be DRY; but if yes, how do you get that sh file replicated everywhere you need it? The closest I could get in my setup, to avoid replication, was to point to an absolute path, also because I could not execute something in my "home folder" (~). So I had to set something like this:
before_hook "workspace" {
    commands = ["plan", "state", "apply", "destroy", "refresh"]
    execute = ["/home/my_user/terragrunt-hook-destroy.sh", get_env("TG_WORKSPACE") ]
  }

What is the magic in your filesystem layout, or terragrunt configuration, that makes it possible to refer to a local file?
Thank you

@ahummel25

ahummel25 commented Apr 3, 2023

Do any of the above workspace select/show/new hooks work when you are deploying to separate accounts? Say you have a dev stack going to account A and a prod stack going to account B.

I get this error when trying any of the above-mentioned workspace before_hook solutions:

Error: Backend initialization required: please run "terraform init"

The way I've gotten this to work is to do the following (steps 1, 2, and 3 are scripted):

  1. Set AWS_PROFILE via a custom shell script to point to account A or B.
  2. Set TF_WORKSPACE to dev or prod depending on which account (A or B) I'll be deploying to.
  3. Set TERRAGRUNT_AUTO_INIT to false.
  4. Have this before_hook in my root terragrunt.hcl:
terraform {
  before_hook "init_reconfigure" {
    commands = ["apply", "destroy", "plan", "refresh", "state"]
    execute  = ["terraform", "init", "-reconfigure"]
  }
}

This works for a multi-account strategy, but I wasn't sure if anyone had come across a better solution.
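
For reference, steps 1-3 might be scripted roughly like this (profile and workspace names are illustrative):

#!/bin/bash
# pick the account/workspace pair to deploy (values are illustrative)
export AWS_PROFILE="account-a"     # account A for dev, account B for prod
export TF_WORKSPACE="dev"          # match the workspace to the account
export TERRAGRUNT_AUTO_INIT=false  # step 3: disable auto-init
terragrunt plan  # the before_hook above runs terraform init -reconfigure first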

@corinz

corinz commented Aug 23, 2023

Are there any plans to turn this into a natively supported terragrunt feature?
