Storing sensitive values in state files #516

Open
seanherron opened this issue Oct 28, 2014 · 183 comments

Comments

@seanherron
Contributor

#309 was the first change in Terraform that I could find that moved to store sensitive values in state files, in this case the password value for Amazon RDS. This was a bit of a surprise for me, as previously I've been sharing our state files publicly. I can't do that now, and feel pretty nervous about the idea of storing state files in version control at all (and definitely can't put them on github or anything).

If Terraform is going to store secrets, then some sort of field-level encryption should be built in as well. In the meantime, I'm going to change things around to use https://github.com/AGWA/git-crypt on sensitive files in my repos.

@bitglue

bitglue commented Jan 28, 2015

See #874. I changed the RDS provider to store an SHA1 hash of the password.

That said, I'm not sure I'd agree that it's Terraform's responsibility to protect data in the state file. Things other than passwords can be sensitive: for example, if I had a security group restricting SSH access to a particular set of hosts, I wouldn't want the world to know which IP they need to spoof to gain access. The state file can be protected orthogonally: you can keep it off GitHub, put it in a private repo, use git-crypt, etc.

@kubek2k
Contributor

kubek2k commented Jan 28, 2015

related #689

@dentarg

dentarg commented Mar 17, 2015

Just want to give my opinion on this topic.

I do think Terraform should address this issue. I think it will increase the usefulness and ease of use of Terraform.

Some examples from other projects: Ansible has vaults, and on Travis CI you can encrypt information in the .travis.yml file.

@ketzacoatl
Contributor

Ansible Vault is a feature I often want in other devops tools. Protecting these details is not as easy as protecting the state file. What about using Consul or Atlas as a remote/backend store?

+1 on this

@dayer4b
Contributor

dayer4b commented May 28, 2015

I just want to point out that, according to official documentation, storing the state file in version control is a best practice:

https://www.terraform.io/intro/getting-started/build.html

Terraform also put some state into the terraform.tfstate file by default. This state file is extremely
important; it maps various resource metadata to actual resource IDs so that Terraform knows what
it is managing. This file must be saved and distributed to anyone who might run Terraform. We
recommend simply putting it into version control, since it generally isn't too large.

(emphasis added)

Which means we really shouldn't have to worry about secrets popping up in there...

@hobbeswalsh

👍 on this idea -- it would be enough for our case to allow configuration of server-side encryption for S3 buckets. Any thoughts on implementing that?

@apparentlymart
Member

At the risk of adding scope to this discussion, I think another way to think of this is that Terraform's current architecture is based on a faulty assumption: Terraform assumes that all provider configuration is sensitive and that all resource configuration isn't sensitive. That is wrong in both directions:

  • Several resources now take passwords as inputs or produce secret values as outputs. In this issue we see the RDS password as one example. The potential Vault provider discussed in #2221 is another example.
  • Several provider arguments are explicitly not sensitive, such as the AWS region name, and excluding them from the Terraform state results in Terraform having an incomplete picture of the world: it can see that there is an EC2 instance with the id i-12345 but it can't see what region that instance is in without help of the configuration. Changing the region on the AWS provider causes Terraform to lose track of all of the existing resources, because as far as the AWS provider is concerned they've all been apparently deleted.

So all of this is to say that I think overloading the provider/resource separation as a secret/non-secret separation is not the best design. Instead, it'd be nice to have a mechanism on both sides to distinguish between things that should live in the state and things that should not, so that e.g. generated secrets can be passed into provisioners but not retained in the state, and that the state can encode that a particular instance belongs to a particular AWS region and respond in a better way when the region changes.

There are of course a number of tricky cases in making this work, which I'd love to explore some more. Here are some to start:

  • If you don't retain something in the state then it's not safe to interpolate it anywhere because future runs will assume they can interpolate attributes from existing resources in the state.
  • Some provider config changes effectively switch all resources to an entirely new "namespace", and thus effectively force every attached resource to be destroyed and recreated in the new region. The AWS region is one example, since AWS resources are region-specific. But that's not so simple for other arguments: the AWS access_key might change what Terraform has permission to interact with, but it doesn't change the id namespace that resources live within.

@little-arhat

Hi, any progress on this? Terraform 0.6.3 still stores raw passwords in the state file. Also, as a related issue, if you do not want to keep passwords in configuration, you can create a variable without a default value. But this will force you to pass the variable every time you run plan/apply, even if you're not going to change the resource that uses the password.

I think it would be nice to separate sensitive values from other attributes, so that they would:

  • be stored as a SHA-1 hash or something similar in the state file
  • not need to be supplied again if the state already has one.

So, for configuration like:

variable "db_password" {}

resource ... {
    password = "${var.db_password}"
}

Terraform will require the variable on the first run, when it doesn't have anything stored, but not on subsequent runs.

To change such a value, one needs to provide a different value for the password.
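
The proposal above could be sketched roughly as follows. This is a toy model in Python, not Terraform's actual behavior: the `state` dict, the `password_hash` key, and the function names are all hypothetical, and SHA-256 is swapped in for the SHA-1 mentioned above. Note that an unsalted hash of a weak password can still be brute-forced, so this hides the value rather than fully protecting it.

```python
import hashlib
from typing import Optional


def fingerprint(secret: str) -> str:
    """Return a hex digest that stands in for the secret in the state."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def plan_password(state: dict, supplied: Optional[str]) -> str:
    """Decide what to do with the password without ever reading it from state."""
    stored = state.get("password_hash")  # hypothetical state key
    if supplied is None:
        if stored is None:
            # First run: nothing is stored yet, so the value is mandatory.
            raise ValueError("password is required on the first run")
        # Later runs: no value supplied means "leave it alone".
        return "no-op"
    if stored is None:
        return "create"
    return "no-op" if fingerprint(supplied) == stored else "update"


def apply_password(state: dict, supplied: Optional[str]) -> str:
    """Run the plan and record only the digest of a new or changed password."""
    action = plan_password(state, supplied)
    if action in ("create", "update"):
        state["password_hash"] = fingerprint(supplied)
    return action
```

With an empty `state`, the first call must supply a password; later calls may omit it, and only a different value triggers an update.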

@EvanKrall
Contributor

Maybe there's a simple solution: store the state in Vault?

@mwarkentin
Contributor

A good solution for this would be useful for us as well - we're manually configuring certain things to keep them out of the tfstate file in the meantime.

@ascendantlogic

So as I slowly cobble together another clean-sheet infra with Terraform I see this problem still exists, and this issue is almost exactly 1 year old. What is the thinking with regard to solving this? The ability to mark specific attributes within a resource as sensitive, and storing SHA1 or SHA2 hashes of their values in the state for comparison? I see this comment on a related ticket; does that mean that using Vault will be the prescribed way? I get that it promotes product synergy, but I'd really like a quick-n-dirty hashing solution as a fallback option, if I'm honest.

@ketzacoatl
Contributor

Moving secrets to Vault, and using consul-template or integrating with other custom solutions you have for CM, certainly helps in a lot of cases, but keeping secrets out of TF entirely, or out of the TF state, is not always reasonable.

@ascendantlogic

Sure, in this particular case I don't want to manually manage RDS but I don't want the PW in the state in cleartext, regardless of where I keep it. I'm sure this is a somewhat common issue. Maybe an overarching ability to mark arbitrary attributes as sensitive is shooting for the moon but a good start would be anything that is obviously sensitive, such as passwords.

@jfuechsl

Would it be feasible to open up state handling to plugins?
The standard could be to store it in files, like it is currently done.
Other options could be Vault, S3, Atlas, etc.

That way this issue can be dealt with appropriately based on the use-case.

@brikis98
Contributor

I just got tripped up by this as well, as the docs explicitly tell you to store .tfstate files in version control, which is problematic if passwords and other secrets end up in the .tfstate files. At the bare minimum, the docs should be updated with a massive warning about this. Beyond that, there seem to be a few options:

  1. Offer some way to mark variables as secret and either ensure they never get stored in .tfstate files or store them in a hashed form.
  2. Encrypt the entire .tfstate file.
  3. Remove the recommendation to store .tfstate files in version control and recommend storing them only in secure, preferably encrypted storage.

@ejoubaud

ejoubaud commented Jan 5, 2016

One thing to consider around this is outputs. When you create a resource with secrets (key pair, access keys, db password, etc.), you likely want to show the secret in question at least once (possibly in the stdout of the first run, as outputs do).

Currently outputs are also stored in plain text in the .tfstate, and can be retrieved later with terraform output.

One possible solution would be a mechanism to only show the secrets once, then not store them at all and not show them again (like AWS does), possibly using only-once output as I suggested in #4437
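
A minimal sketch of the show-once idea, assuming each output carries a hypothetical `show_once` flag. Nothing here is an existing Terraform feature; `finish_apply` and the output names are made up for illustration.

```python
from typing import Dict, Tuple


def finish_apply(
    outputs: Dict[str, Tuple[str, bool]],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split outputs into what is printed now and what is written to state.

    Each entry maps an output name to (value, show_once). Show-once values
    are displayed on this run but left out of the persisted state.
    """
    shown = {name: value for name, (value, _once) in outputs.items()}
    persisted = {
        name: value for name, (value, once) in outputs.items() if not once
    }
    return shown, persisted


shown, persisted = finish_apply(
    {
        "endpoint": ("db.example.internal", False),
        "admin_password": ("s3cr3t", True),  # displayed once, never stored
    }
)
```

The trade-off is the one AWS makes for access keys: if the operator misses the value on the first run, the only recovery is to rotate the secret.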

@revett

revett commented Jan 14, 2016

+1

@sstarcher

+1

@Tbohunek

Tbohunek commented Jul 27, 2021

Hi @jbardin I don't understand #28292 (comment).

Terraform cannot store only the hash of the value, because the original value is required for terraform's operation

Have you already described anywhere what Terraform uses the original value for, beyond comparison? Thanks.

@jbardin
Member

jbardin commented Jul 27, 2021

Hi @Tbohunek,

If the attribute is referenced anywhere else in the configuration, terraform must preserve the original value in order to propagate it through the configuration. It must also be stored in order to satisfy the provider protocol, as providers require any stored value to be returned unchanged (see Terraform Resource Instance Change Lifecycle).

@Tbohunek

Thanks @jbardin but I still don't get it. Would you mind sparing a few more minutes to give a specific example of where hash would not work?

In state there is no propagation. Each attribute value is stored explicitly.
In plan the current value is retrieved from the runtime variable and propagates from there.
That said, terraform should compare just the hash of each propagated value with the state-stored hash. Why would this not work?

@whiskeysierra

In plan the current value is retrieved from the runtime variable and propagates from there.

What do you mean by runtime variable?

@Tbohunek

@whiskeysierra I mean that when I terraform plan, the desired value of the (sensitive) attribute is present in secrets.auto.tfvars or the like.
So Terraform can propagate it downstream and, in each instance, check its hash against the hash stored in .tfstate.
If the provider is to return any value, it would be the new value because either the old value has changed and is not relevant, or is unchanged and remains untouched in state.

@jbardin
Member

jbardin commented Jul 28, 2021

Would you mind sparing a few more minutes to give a specific example of where hash would not work?

A primary example is the provider protocol, where a stored value is required to be returned per the contract of the API. There may be options to extend the protocol in the future for other types of values and resources, but we must uphold the agreed upon API with existing providers.

In state there is no propagation. Each attribute value is stored explicitly.
In plan the current value is retrieved from the runtime variable and propagates from there.

Directly retrieving a variable at runtime to be used in an ephemeral manner during plan is only a subset of the problem. All configuration evaluation is based on the state. If a resource has stored a value which is itself marked as sensitive, and that value is referenced by another resource, it must be retrieved from the state in order to evaluate the reference. After initial creation, if that value is required for subsequent plans of any resources, we again need the original value to send to the provider.
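
A toy model of the reference problem described above, assuming a state that is just a nested dict. The names `evaluate_reference`, `render_app_config`, and the `db` resource are hypothetical and do not reflect Terraform's internals. If only a digest is stored, anything that interpolates the attribute on a later run receives the digest rather than the secret:

```python
import hashlib


def digest(value: str) -> str:
    """One-way hash; the original value cannot be recovered from it."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def evaluate_reference(state: dict, resource: str, attribute: str) -> str:
    """Resolve a reference to another resource's attribute from stored state."""
    return state[resource][attribute]


def render_app_config(state: dict) -> str:
    """A dependent resource that needs the database password as an input."""
    return "DB_PASSWORD=" + evaluate_reference(state, "db", "password")


full_state = {"db": {"password": "s3cr3t"}}
hashed_state = {"db": {"password": digest("s3cr3t")}}

print(render_app_config(full_state))    # the real secret reaches the dependent resource
print(render_app_config(hashed_state))  # only a 64-character hex digest is available
```

A digest is enough to answer "did the value change?", but not "what value should the dependent resource receive?", which is the distinction being drawn in the comment above.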

@Tbohunek

If a resource has stored a value which is itself marked as sensitive, and that value is referenced by another resource, it must be retrieved from the state in order to evaluate the reference.

Could this really not be done with the hash only? This evaluation should work the same, provided it knows it's going to get a hash. The current implementation is really unfortunate.

@mr-miles

mr-miles commented Jul 28, 2021 via email

@spstarr

spstarr commented Aug 12, 2021

At the risk of asking something simple

why not:

data "aws_ssm_parameter" "key" {
  skip_store_tfstate = true # hypothetical attribute: true/false, or something like it
  name               = "/path/to/some/key"
  with_decryption    = true
}

So on either terraform apply or destroy, it forces Terraform to query AWS again for the value every time. Then it's never stored in tfstate, and we don't have this issue of having to use Vault, which defeats the purpose of using SSM in the first place.

Of course, the caveat is that this assumes the SSM value hasn't changed. If it has, it may or may not break terraform destroy, but that's an acceptable risk to me; you could just fix the SSM key(s) in that case.

@Tbohunek

@spstarr in order for Terraform to detect if the value has changed, it has to store some information about the value.

But it shouldn't store the value itself, it should be capable of storing just its hash.

@spstarr

spstarr commented Aug 12, 2021

@Tbohunek but does it need to know if it's changed or not if we've explicitly told it to ignore state? A hash is fine.

@Tbohunek

@spstarr it does, if it should avoid trying to alter the actual resource.
Imagine this value is a VM admin password. Do you want Terraform to reset the admin password on every run? No? Then its value/hash needs to be stored in state.

@spstarr

spstarr commented Aug 12, 2021

@Tbohunek well, I don't want Terraform to reset or change it if I've told it this value must always be queried from AWS every single time, with no state. I'm not sure why this concept isn't simple to implement: a do_not_store_state_or_check_state flag that just gets the value as-is from AWS.

@acdha

acdha commented Aug 12, 2021

@spstarr in order for Terraform to detect if the value has changed, it has to store some information about the value.

But it shouldn't store the value itself, it should be capable of storing just its hash.

It doesn't even need to do that: SSM Parameters and Secrets both have version numbers which increment every time the value is changed, so it should be possible to treat this somewhat like the way people use null resources for sequencing: Terraform could avoid both decrypting the secret and triggering an update to the target resource unless the version number has changed since the last update. That would have the problem of not catching someone changing the target resource outside of Terraform without updating the corresponding Parameter/Secret, so there could be an argument for making that behavior conditional, but that seems like a situation to strongly discourage.
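
The version-based approach could look something like this sketch. It assumes the secret store exposes a version identifier that changes whenever the value does; `sync_secret`, the `secret_version` state key, and the three callbacks are hypothetical names, not a real provider API. The version is only compared for equality, so it can be a number or any opaque token.

```python
from typing import Callable, Dict


def sync_secret(
    state: Dict[str, object],
    read_version: Callable[[], object],
    read_value: Callable[[], str],
    update_target: Callable[[str], None],
) -> bool:
    """Push the secret to the target only when its version has changed.

    Only the version identifier is written to state; the secret value is
    held in memory just long enough to be passed to the target.
    """
    current = read_version()
    if state.get("secret_version") == current:
        return False  # unchanged: skip both decryption and the update
    update_target(read_value())
    state["secret_version"] = current
    return True
```

As noted above, this cannot detect a change made directly on the target resource, since nothing about the target's actual value is recorded.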

@jbardin
Member

jbardin commented Aug 12, 2021

@spstarr, @acdha, Storing only a hash or changing the behavior of existing data resources are not viable options here, but we do have a proposal for an ephemeral resource type, which would work roughly the way you describe. Since there are difficulties in supplying credentials to providers other than the storage aspect, more solutions to avoid the credentials in state at all may come out of the related issue #29182.

@frittentheke

Encryption of the whole state file prior to storing it remotely is being discussed and worked on in #9556 or rather #28603

@dylanturn

dylanturn commented May 6, 2022

Encryption of the whole state file prior to storing it remotely is being discussed and worked on in #9556 or rather #28603

And where/how do we store the state encryption keys?

@leosco

leosco commented May 9, 2022

Encryption of the whole state file prior to storing it remotely is being discussed and worked on in #9556 or rather #28603

And where/how do we store the state encryption keys?

Those would have to be managed outside source control via Vault or another secrets management product; ansible-vault has similar support for secrets and identity providers to encrypt sensitive data before checking it into source control.

@Stavfilipps

Not an expert, but would it be possible to encrypt those secrets on the client side with a key stored, for instance, in AWS Secrets Manager? Then decryption would happen on terraform apply, after fetching the decryption key.

@leosco

leosco commented Jun 30, 2022 via email

@pj

pj commented Jul 17, 2022

I'm pretty ignorant about how data types work in terraform, but would it be possible to have a "ProviderManagedVariable" interface? This would be an interface for a type that stores sensitive state externally, but contains information about how to access that state via a provider, basically like a pointer or reference. Only the non-sensitive information like the provider name, provider version, key name, last changed date etc would actually be stored in the terraform state.

This would require users of the type to be rewritten to grab the external variable when necessary, but not store it in their state. Also, every user of the data type would have to configure the correct provider to work.
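
To make the idea concrete, here is a rough sketch of such a reference type. Everything in it is hypothetical: the class mirrors the "ProviderManagedVariable" name used above, and `providers` is an assumed mapping from provider name to a function that fetches a secret by key. It is not an existing Terraform interface.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ProviderManagedVariable:
    """A pointer to a secret held by a provider; it never holds the secret."""

    provider: str
    provider_version: str
    key_name: str
    last_changed: str

    def to_state(self) -> dict:
        """Only this non-sensitive metadata would be written to the state."""
        return asdict(self)

    def resolve(self, providers: Dict[str, Callable[[str], str]]) -> str:
        """Fetch the secret at the moment of use via the configured provider."""
        fetch = providers[self.provider]
        return fetch(self.key_name)


ref = ProviderManagedVariable(
    provider="example-secrets",  # hypothetical provider name
    provider_version="1.0.0",
    key_name="/path/to/some/key",
    last_changed="2022-07-17",
)
secret = ref.resolve({"example-secrets": lambda key: "value-for-" + key})
```

Every consumer has to call `resolve` with a working provider configuration, which is the rewrite cost mentioned above.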

@leosco

leosco commented Jul 17, 2022 via email

@leosco

leosco commented Oct 11, 2022 via email

@p4gs

p4gs commented Jan 4, 2023

It's 2023. Are there any plans at all to make it so that Terraform has the ability to not expose plaintext credentials in .tfstate?
