
S3 Backend & Workspaces #15358

Closed
rowleyaj opened this Issue Jun 21, 2017 · 20 comments

rowleyaj commented Jun 21, 2017

Terraform Version

Terraform v0.10.0-dev (c10f5ca)

Terraform Configuration Files

terraform {
  required_version = "~> 0.10"

  backend "s3" {
    bucket  = "rowleyaj-tf-state-demo"
    key     = "v1/terraform-remote-state-example"
    region  = "us-east-1"
  }
}

resource "null_resource" "sleep" {
  provisioner "local-exec" {
    command = "sleep 10"
  }
}

Expected Behavior

Terraform stores the state file in S3 under the specified key, v1/terraform-remote-state-example, with workspace handling done within the namespace of that key; effectively, objects would have the workspace name appended as a suffix.

Actual Behavior

When using workspaces, Terraform stores the state at the root of the bucket. Inconsistently (for backwards compatibility?), the default workspace doesn't follow this pattern.

The S3 browser and a local file sync also treat the : in env: as a directory separator.

I've synced the objects from the bucket locally so I could use tree to illustrate the keys in a folder format.

$ aws s3 sync s3://rowleyaj-tf-state-demo .
$ tree
.
├── env:
│   └── testing
│       └── v1
│           └── terraform-remote-state-example # testing workspace
└── v1
    └── terraform-remote-state-example # default workspace

Steps to Reproduce

  1. Configure S3 remote state
  2. terraform init
  3. terraform apply
  4. terraform workspace new testing
  5. terraform apply
  6. Check structure on S3.

References

Are there any other GitHub issues (open or closed) or Pull Requests that should be linked here? For example:
#14943
#13184

apparentlymart commented Jun 21, 2017

Hi @rowleyaj! Thanks for opening this issue.

I just want to confirm what I think you're asking for here: you'd prefer to see the workspace name be a suffix of the specified key, rather than a prefix as it seems to be now. Is that right?

Indeed there are some backward compatibility concerns here, but we know the current behavior is not completely smooth, so we would definitely be open to finding a good compromise that makes the behavior more useful/intuitive.

rowleyaj commented Jun 22, 2017

@apparentlymart no worries on opening the issue; I debated for a while whether I should, as I wasn't sure if it was a feature or a bug.

I just want to confirm what I think you're asking for here: you'd prefer to see the workspace name be a suffix of the specified key, rather than a prefix as it seems to be now. Is that right?

Yes, effectively this is what I was suggesting, although after looking into this more I think even this might not be the correct solution. Looking at the code I can understand why this is being done the way it is: using a prefix allows the workspaces to be listed out via the S3 API.

https://github.com/hashicorp/terraform/blob/master/backend/remote-state/s3/backend_state.go#L26
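For illustration, the same prefix listing can be reproduced with the AWS CLI (bucket name taken from the configuration above; the --query expression is just one way of printing the keys):

$ aws s3api list-objects-v2 \
    --bucket rowleyaj-tf-state-demo \
    --prefix "env:/" \
    --query 'Contents[].Key' \
    --output text
env:/testing/v1/terraform-remote-state-example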

From a usability standpoint, based on the current documentation, I do believe this is a bug rather than an enhancement request. The docs for the S3 backend state:

key - (Required) The path to the state file inside the bucket.

When using workspaces this is broken: the state file is no longer within that "path".

Indeed there are some backward compatibility concerns here, but we know the current behavior is not completely smooth, so we would definitely be open to finding a good compromise that makes the behavior more useful/intuitive.

My understanding of this is as follows:

To address this issue I propose:

  • Document the current behaviour as noted above, perhaps with a warning on key in the documentation to make clear that when working with workspaces key won't act as it did previously.
  • Address #13184: as an enhancement, expose keyEnvPrefix as workspace_key_prefix https://github.com/hashicorp/terraform/blob/master/backend/remote-state/s3/backend_state.go#L20 with the default set to the value of the constant as it is now (a sketch of the resulting configuration follows this list).
  • Modify the documentation to better explain the combined effect of the two configurable values.
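A rough sketch of the resulting configuration (option name as proposed above; the object keys shown as comments assume the current prefix/workspace/key ordering stays the same):

terraform {
  backend "s3" {
    bucket = "rowleyaj-tf-state-demo"
    key    = "v1/terraform-remote-state-example"
    region = "us-east-1"

    # Proposed option; would default to the current "env:" constant so
    # existing state objects keep working.
    workspace_key_prefix = "workspaces"
  }
}

# default workspace:  v1/terraform-remote-state-example
# testing workspace:  workspaces/testing/v1/terraform-remote-state-example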

Do you think this would address these issues but maintain the backwards compatibility required?

apparentlymart commented Jun 22, 2017

I'm going to loop in @jbardin here since he was more deeply involved in the decision process that led us here. On the surface this seems right to me but he will likely have more context as to why things ended up how they did and whether there are other concerns that are not obvious.

jbardin commented Jun 22, 2017

Hi @rowleyaj,

Yes, I admit this is a little awkward. It was laid out for backward compatibility, but also because it followed the pattern used in the Consul backend. Because of the different semantics of the storage layers, users seem to treat the Consul path layout as an implementation detail, while in S3 they want to actively manage the objects.

I tested a very extensive PR to try and change the layout at one point, but there are so many pitfalls in migrating the state transparently for users that we opted to leave it as-is and work around any naming issues (and since you've done your homework, you can see why things are the way they are 😉).

I think this proposal is probably the best we can make of the situation (the documentation definitely should have been fixed already). My only concern is fully testing the integration with backend migrations, which are triggered anytime the backend config changes. While the 0.10 release would be a good time to get this in, I don't want to drop in a change without running through all those scenarios.

Thanks!

rowleyaj commented Jun 28, 2017

@jbardin are you good with me leaving this open for you to address the doc changes or would you like me to take a stab at that?

jbardin commented Jun 28, 2017

@rowleyaj: no problem, I can take care of that. Thanks!

fatgit commented Jul 31, 2017

@rowleyaj How can I create a new workspace that copies an existing state, like with local state?

terraform workspace new -state=path testing

I tried the following workaround:

terraform state pull > s3.tfstate
terraform workspace new -state=./s3.tfstate testing

and got:

Created and switched to workspace "testing"!

You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.
Acquiring state lock. This may take a few moments...
incompatible state lineage; given 9dec780b-2b5a-402c-acc9-f97eab09a152 but want dd2fdd08-58dd-4f33-bad4-d54f73140bed
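One possible interim workaround (not confirmed in this thread; note that -force bypasses the lineage/serial safety checks, so double-check that the new workspace is selected first) is to create the empty workspace and then push the pulled state into it:

$ terraform state pull > s3.tfstate
$ terraform workspace new testing
$ terraform state push -force s3.tfstate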

jbardin commented Jul 31, 2017

Hi @fatgit,

Sorry about the new state issue. It got caught by our strict lineage checks, which were changed a bit recently. I filed an issue for it here: #15674

tyrsius commented Jul 31, 2017

I'm bringing this comment over from the closed issue #15654.

Terraform handles the environment and the S3 backend key in the wrong order. The docs say this for key:

key - (Required) The path to the state file inside the bucket.

When using environments the key is not respected, and instead files are stored in the bucket under env:/{environment}/{s3key}. The literal "env:" (yes, with a colon) is confusing enough, but the fact that the key is the last thing used is just wrong. The key property is useful for isolating multiple applications/configurations in a single bucket, but when using environments everyone shares the "env:" folder as well as the environment folder. This breaks isolation across applications/configurations.
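A sketch of the resulting layout for two hypothetical configurations sharing one bucket (keys app1/terraform.tfstate and app2/terraform.tfstate), in the same tree format as above:

.
├── app1
│   └── terraform.tfstate          # app1, default workspace (key respected)
├── app2
│   └── terraform.tfstate          # app2, default workspace (key respected)
└── env:
    └── staging
        ├── app1
        │   └── terraform.tfstate  # app1, staging workspace
        └── app2
            └── terraform.tfstate  # app2, staging workspace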

If this is the intended behavior and not a bug, which I hope it is, then I would like to request a method for actually specifying the backend key that is used at the root, as the key property has historically done.

tyrsius commented Aug 1, 2017

@jbardin My original issue #15654 was closed and linked to this one, and then this one was closed. I don't really feel like the issue has been resolved.

jbardin commented Aug 1, 2017

Hi @tyrsius,

Sorry the resolution here isn't satisfactory. This issue covered the points you made in your original issue, and the PR that functionally closed this issue adds the workspace_key_prefix option, which lets you at least control the full path of the state objects in all cases.

As noted, the existing scheme was built on different assumptions than some users had, and exists in its current form for backwards compatibility. Migrating to a new naming scheme would either require an extensively tested migration or a full break in backward compatibility. While neither of these options is completely ruled out, there would need to be a sufficiently valuable use case to offset the risk of either of them. If you want to make a case for inclusion in the next major release, feel free to continue the discussion in your original issue and we will gladly reopen and review it.

tyrsius commented Aug 1, 2017

If workspace_key_prefix has been accepted, that works; it seemed like that had not yet happened.

jbardin commented Aug 1, 2017

@tyrsius: that was merged in #15370, and will be included in the next release.

taiidani commented Mar 7, 2018

Sorry for bringing this issue back up, @jbardin (I can open a new issue if more appropriate), but it doesn't look like workspace_key_prefix entirely solves the issue?

From the documentation, it looks like:

  • workspace=default, key=myproject.tfstate and workspace_key_prefix=teams/myteam would result in the state being stored at the S3 key myproject.tfstate.
  • workspace=production, key=myproject.tfstate and workspace_key_prefix=teams/myteam would result in the state being stored at the S3 key teams/myteam/production/myproject.tfstate (illustrated in the sketch below).
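In backend-config terms (bucket name is hypothetical; the resulting object keys are shown as comments):

terraform {
  backend "s3" {
    bucket               = "myteam-tf-state"
    key                  = "myproject.tfstate"
    region               = "us-east-1"
    workspace_key_prefix = "teams/myteam"
  }
}

# default workspace:     myproject.tfstate
# production workspace:  teams/myteam/production/myproject.tfstate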

I understand the need for backwards compatibility here, but for scenarios where workspaces are in use, that causes a fairly negative side effect on the pathing in the bucket.

Is there any way to have all workspaces, even the default one, live under a consistent workspace_key_prefix? A suffixing option would work great but it sounds like the possibility for that has been ruled out.

jbardin commented Mar 7, 2018

Hi @taiidani,

Sorry this is still causing an issue. As you said, we need to maintain backwards compatibility, and unfortunately there isn't a single solution that satisfies everyone. Every option we add to change the layout of the remote state is complicated by the requirement that the terraform_remote_state data source must be able to locate the state from different versions of Terraform.

Changing the S3 object layout may be an option for a future major release where we can break compatibility, but it's not something we can fit in the near term.

taiidani commented Mar 7, 2018

Understood. We'll do our best to live with the behavior until that day comes. For now I'll try putting the prefix in the key so we can at least get our IAM policies against the prefix to work:

teams/myteam/myproject.tfstate
teams/myteam/staging/teams/myteam/myproject.tfstate
teams/myteam/production/teams/myteam/myproject.tfstate

rowleyaj commented Apr 30, 2018

@taiidani I would probably suggest just using a different bucket per team if it's a per-team use case that you have. S3 buckets aren't as limited a resource as they once were, when the maximum was 100 per account.

That would allow you to use IAM policies that restrict a team to only their bucket, and wouldn't require the duplication between the key and the prefix in the Terraform configuration.
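As a rough sketch of that approach (bucket and policy names are made up; the exact action list depends on whether you also use DynamoDB locking):

resource "aws_iam_policy" "myteam_state" {
  name = "myteam-terraform-state"

  # Grants a team access to its own state bucket and nothing else.
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::myteam-tf-state"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::myteam-tf-state/*"
    }
  ]
}
EOF
}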

zerolaser commented May 9, 2018

@rowleyaj What is the scenario if we have a different bucket for the dev and production environments? How can we change the backend S3 bucket?

ddpalmer commented Jul 30, 2018

@zerolaser I have the same challenge. Each environment is in a separate AWS account. We have a business requirement that each environment store the Terraform state in its own account. I don't have a working solution to this yet. Perhaps someone has some tips & tricks?

tyrsius commented Jul 30, 2018

@ddpalmer We do this; it's not too difficult. If you remove the bucket property from your backend config, you can inject it from the CLI:

terraform init \
       -backend-config bucket="${terraform_state_bucket}"
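The corresponding backend block is left partial, with only the bucket omitted, so each account supplies its own bucket at init time (key and bucket names here are hypothetical):

terraform {
  backend "s3" {
    key    = "myproject/terraform.tfstate"
    region = "us-east-1"
    # bucket intentionally omitted; supplied via -backend-config at init
  }
}

$ terraform init -backend-config="bucket=dev-tf-state"    # dev account
$ terraform init -backend-config="bucket=prod-tf-state"   # prod account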