Any benefit of using separate remote state files? #3838

Closed
vasiliyb opened this issue Nov 10, 2015 · 13 comments

@vasiliyb

Folks,
When we initially implemented Terraform, we may have made a mistake: we set up several separate Terraform projects, each of which manages the same AWS account's infrastructure. We have ended up with an S3 bucket full of separate remote state files.
Now we face the dilemma of sharing things between these Terraform setups, which leads me to these questions:
1.) Combining the JSON from all the state files should probably give us a way out. Am I correct that merging the .tf code bases and combining the remote state files into one would get us onto a single code base?

2.) Is there any benefit of keeping the current setup?

Thanks! Hope this is a good venue to ask such a question, as I have not found much information about this anywhere in the docs. Would love to update with my findings!

@apparentlymart
Member

@vasiliyb this is a great question but one that I think is rather subjective: what will work best for you will depend greatly on the scale and structure of what you're deploying. However, I can describe the pattern used by my team at work for managing a (somewhat-)complex infrastructure as a set of separate Terraform states.

The terraform_remote_state resource allows the state produced by one config to be consumed by another, "downstream" configuration. Using this mechanism we create a layered structure as follows:

[diagram: terraform-state-tree -- the layered structure of configs and states described below]

Our architecture is such that there are many distinct applications, each of which controls most of its own infrastructure, and then these apps join "environments" (PROD and QA in this example) to gain access to shared infrastructure that allows them to communicate with one another, such as networks (VPC, subnets) and a Consul cluster.

Each of the boxes in the diagram is a Terraform config and an associated remote state. The states for the top two ranks are stored in S3, and the states for the bottom two ranks are stored within the Consul cluster in each environment (each of which is created by the configs on the second rank).

So we have many different distinct configs and Terraform states, but they all belong to one of the following categories:

  • Global (exactly one; just contains some IAM resources that we use for deployment/management across the whole infrastructure)
  • Environment Region (one per AWS region per environment)
  • Environment Global (one per environment; creates a Route53 zone for each environment and some other global things)
  • Environment Availability Zone (one per AWS AZ per environment; these actually just portion out resources created by the Environment Region configs in a per-AZ structure for app convenience, and don't create any new resources of their own -- I'd love to replace them with the terraform_synthetic_state resource I proposed in #3164)
  • Application (one config per application, but with a separate remote state for each environment so that each environment has its own distinct copy; the app config is free to use any combination of the states in each environment on the previous rank to deploy in a multi-AZ, multi-region manner; a sketch of this follows the list)
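To make that last category concrete, here is a hedged sketch of an app config consuming two per-AZ states to deploy across availability zones. The bucket, key layout, and output names are all hypothetical, written in the pre-0.7 syntax in which terraform_remote_state was a resource:

resource "terraform_remote_state" "az_a" {
    backend = "s3"
    config {
        bucket = "terraform-remote-state"
        key    = "prod/us-east-1a.tfstate"
        region = "us-east-1"
    }
}

resource "terraform_remote_state" "az_b" {
    backend = "s3"
    config {
        bucket = "terraform-remote-state"
        key    = "prod/us-east-1b.tfstate"
        region = "us-east-1"
    }
}

# One app instance per AZ, each placed in the subnet published as an
# output by the corresponding Environment Availability Zone state.
resource "aws_instance" "app_a" {
    ami           = "ami-12345678"    # hypothetical AMI
    instance_type = "t2.micro"
    subnet_id     = "${terraform_remote_state.az_a.output.subnet_id}"
}

resource "aws_instance" "app_b" {
    ami           = "ami-12345678"
    instance_type = "t2.micro"
    subnet_id     = "${terraform_remote_state.az_b.output.subnet_id}"
}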

This structure has been working well for us so far, and allows us to manage the lifecycles of different parts of the system separately. We deploy our apps a lot more often than we modify the different infrastructure configs, and we deploy them as a single unit so that the app gets updated everywhere in one action. The infrastructure configs are more separated because we tend to update them less often, with more care, one region at a time so that we can't break all of our infrastructure in a single step.

However, our environment is relatively complex with many separately-maintained applications all interacting via shared infrastructure in multiple environments. I expect this would be overkill for a single, relatively-simple application where a standalone Terraform configuration could be more than sufficient.

Ultimately you'll need to trade off the complexity of coordinating changes across many different configs against the flexibility of being able to evolve each subsystem separately.


If you do decide to collapse all of your resources into a single Terraform config, you had the right idea about combining the state files together. You'll need to watch out for any collisions in the local identifiers of resources in different configs (e.g. if you have two configs that both declare resource "aws_instance" "main"), but otherwise it should work.
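As a hypothetical illustration of that collision, suppose two of the merged configs each declared an aws_instance named "main"; one of them has to be renamed in both its .tf file and the merged state (all values here are made up):

resource "aws_instance" "main" {           # from the first config; keeps its name
    ami           = "ami-12345678"
    instance_type = "t2.micro"
}

resource "aws_instance" "main_worker" {    # from the second config; renamed from "main"
    ami           = "ami-87654321"
    instance_type = "t2.micro"
}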

You could also elect to create a wrapper module around your existing modules. In that case you'll have a new module whose entire contents are module "foo" { ... } blocks referring to the others. You'll still need to merge the states together into one, but this time you'll create a different module structure within the state file for each originally-distinct configuration. Since the namespace for resource identifiers is local to each module, you won't need to rename anything in this case. You can use module variables and outputs to pass data between related modules within your wrapper module, so you can create a graph of modules in a similar vein to my example above except that Terraform will apply changes to the entire graph in a single action, rather than managing each part separately.
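A minimal sketch of such a wrapper module, with hypothetical module paths, variable names, and output names -- each formerly-standalone config becomes a child module, and module outputs wire them together:

module "network" {
    source = "./network"      # previously its own root config
}

module "redis" {
    source    = "./redis"     # previously its own root config
    subnet_id = "${module.network.private_subnet_id}"
}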


I'm sorry that this turned out to be a bit of an essay, but I hope some of it is useful. I'm curious to hear about what path you'll ultimately decide to take, and what characteristics of your problem led you to that decision.

@vasiliyb
Author

@apparentlymart best write-up. Thank you!

A question about the following scenario... A new deployment machine can only be configured with one remote Terraform state file (either the vpn or the redis remote state), using one of these commands:

terraform remote config -backend=s3 ........ -backend-config="key=redis.tfstate"

or

terraform remote config -backend=s3 ..... -backend-config="key=vpn.tfstate"

Once this command is run, it will download the state file from S3 and place it in the local .terraform directory, which will then contain either the vpn or the redis state.

BUT the problem I cannot understand is this: there are several terraform_remote_state resources defined in our codebase:

resource "terraform_remote_state" "redis" {
    backend = "s3"
    config = {
        bucket = "terraform-remote-state"
        key= "vpn.tfstate"
        region = "us-east-1"
    }
}
resource "terraform_remote_state" "vpn" {
    backend = "s3"
    config = {
        bucket = "terraform-remote-state"
        key= "vpn.tfstate"
        region = "us-east-1"
    }
}

How does Terraform decide which remote state to record the resources of a given configuration into? Are the terraform_remote_state resources tied to a particular .tf file?

@stack72
Contributor

stack72 commented Nov 30, 2015

@apparentlymart I would love to see this turned into a wiki article / blog post or something so that it can be added to - thoughts?

@vancluever
Contributor

Glad I stumbled upon this. The thing is, I am actually trying to use a single .tf file with multiple states in a single repo (and to build some tooling around doing it).

I found that I can do it by:

terraform remote config -disable -pull=false && rm -f terraform.tfstate
terraform remote config -backend=consul \
    -backend-config="address=demo.consul.io:80" \
    -backend-config="path=org/project/tf/ENV1"

It would be nice to have a terraform remote change command, or maybe a -migrate|push=false option on config, to ensure the local .tfstate file is not migrated or pushed even if it's present. Remote state could then just be reconfigured and pulled, overwriting local state.

Then tooling can just switch to the remote state that it needs, pull it down, and be on its way.

@vasiliyb from what I've seen, state is not part of the config; it's managed separately, and the remote state configuration is a bit of "state" itself (the remote key needs to be in the local cache at .terraform/terraform.tfstate so that terraform knows where to grab the state from). I was a little disappointed to see that it isn't in the main config, but it can be worked around easily enough, as you've seen and as I've pasted above. It's also debatable whether it should be part of the main config.

As for terraform_remote_state, that's actually for grabbing values (like outputs) from other states for use in other configs (kind of like nesting templates in CloudFormation, if you've ever used that). See https://terraform.io/docs/state/remote.html for a good example.

@apparentlymart
Member

@vasiliyb terraform_remote_state is a read-only thing, which is intended to allow you to obtain the outputs produced from a different Terraform config. So in my example, the "Global" config at the top has its state set up using terraform remote config like you said, with the result getting written to S3. The configurations on the second rank -- for example, "PROD us-west-1" -- then use the terraform_remote_state resource to read what was created by the "Global" config. They each in turn get published to separate S3 keys, again using terraform remote config on each of them. In this manner we are able to create a tree of dependent configurations, with each layer getting read-only access to the resources published by the layer above it.
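For example (a hedged sketch with hypothetical names, again in the pre-0.7 resource syntax), the upstream "Global" config publishes an output:

output "deploy_role_arn" {
    value = "${aws_iam_role.deploy.arn}"
}

A downstream config such as "PROD us-west-1" then reads it through the remote state resource's output attribute:

resource "terraform_remote_state" "global" {
    backend = "s3"
    config {
        bucket = "terraform-remote-state"    # hypothetical bucket
        key    = "global.tfstate"
        region = "us-east-1"
    }
}

# referenced elsewhere as "${terraform_remote_state.global.output.deploy_role_arn}"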

@stack72 writing up how we're using Terraform in my blog -- including the details above -- has been on my to-do list for a while but I probably won't get to it until next year now, since my schedule is packed over the holiday period. 😞

@vancluever we actually do what you're describing for the app-level configs (the bottom rank of my diagram above) to allow them to be separately deployed to each environment. We currently have some hacky wrapper scripts that do something like the terraform remote change thing you're talking about. Over in #2824, a similar script to ours was posted by someone else.

@shpsec-dennis

Did this ever see the light of day in the form of a blog post? Real-world examples are always useful.

@charity

charity commented Mar 30, 2016

Ran across this while I was refactoring my own state files, so here's a link to my blog post on managing multiple states / migrating to a tfstate file per environment. http://charity.wtf/2016/03/30/terraform-vpc-and-why-you-want-a-tfstate-file-per-env/ cc @shpsec-dennis

@lestephane

@apparentlymart I'd love to read a more detailed description of your setup. Will you be writing a longer blog article?

@apparentlymart
Member

I'm sorry I never followed up with a full article on this. Things have actually evolved a little since I wrote this, so I have some new stuff to share, but my time is limited right now.

What Charity described is largely the same, differing a little in the details. The only part not shown there is how apps get deployed with Terraform, and these days I think many folks are finding their use case is simple enough to just use Nomad (or similar) for that and not bring Terraform into play at that layer. Terraform can still be useful in this space if you have an environment where cloud services are wired up to apps running on compute resources, since it is good at connecting the dots to help these things talk to one another, but you can get a long way with Terraform writing resource attributes into Consul and Nomad reading Consul to configure job settings.
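A hedged sketch of that last pattern (the key path and resource names are hypothetical): Terraform publishes an attribute of a resource it manages into Consul's KV store via the consul_keys resource, where a Nomad job template could read it to configure the app:

resource "consul_keys" "app_config" {
    key {
        name  = "redis_address"
        path  = "apps/myapp/redis_address"
        value = "${aws_elasticache_cluster.redis.cache_nodes.0.address}"
    }
}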

@apparentlymart
Member

I (finally) wrote some stuff about a more generalized pattern that is based on what I learned designing the architecture for the company I was at when I wrote the above. It's a bit wordy but is hopefully more useful than the specific details of that particular setup:
http://apparently.me.uk/terraform-environment-application-pattern/

I hope it's useful to others trying to model more complex systems in Terraform.

@toddmichael

@apparentlymart really appreciate your Terraform EA pattern document. I notice it was published about a month or so before Terraform 0.9 was released w/ overhauled remote state management. Does 0.9 cause you to rethink anything in that document or does it still pretty much hold up? Thanks again for the great work on this important topic. Cheers.

@apparentlymart
Member

The "State Environments" feature for 0.9 was in progress while I was writing that and so I was somewhat aware of the design of the new feature when I wrote it and watched out for anything that wouldn't work in the new model. I actually -- at the zeroth hour before publishing -- switched parts of it to use terraform env instead of terraform remote config, so what's described there is already using this new feature in a shallow way.

As far as I know, the high-level approach remains sound. One adjustment that could be made for 0.9 is to use ${terraform.env} to avoid having a separate variable for the "target environment", assuming you were doing something like the "Environment Domain" bonus pattern.
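For instance, a hedged sketch of that adjustment (the bucket and key layout are hypothetical): the remote-state read is keyed off the built-in value instead of a separate "target environment" variable:

data "terraform_remote_state" "env_region" {
    backend = "s3"
    config {
        bucket = "terraform-remote-state"
        key    = "${terraform.env}/us-west-1.tfstate"    # e.g. "PROD/us-west-1.tfstate"
        region = "us-east-1"
    }
}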

One current limitation of Terraform's State Environments feature is that it requires that the state for all environments must live inside the same backend. This is contrary to the pattern that many users independently invented before there was first-class support, where perhaps each environment's states were kept in a separate S3 bucket or in a separate AWS account entirely. However, anyone building a fresh system should be able to design it around the current constraints -- for example, putting all the states in a single S3 bucket and using IAM policies to restrict access -- and get a working solution.
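A minimal sketch of such a fresh design (the bucket name is hypothetical): a single S3 backend holds every environment's state, and terraform env new / terraform env select switch between them. As of 0.9, the S3 backend stores non-default environments under an env:/ key prefix, which the IAM policies can match against:

terraform {
    backend "s3" {
        bucket = "terraform-remote-state"
        key    = "myapp.tfstate"
        region = "us-east-1"
    }
}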

@ghost

ghost commented Apr 13, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@hashicorp hashicorp locked and limited conversation to collaborators Apr 13, 2020