Storing sensitive values in state files #516

Open · seanherron opened this issue Oct 28, 2014 · 145 comments
@seanherron (Contributor) commented Oct 28, 2014

#309 was the first change in Terraform that I could find that moved to store sensitive values in state files, in this case the password value for Amazon RDS. This was a bit of a surprise for me, as previously I've been sharing our state files publicly. I can't do that now, and feel pretty nervous about the idea of storing state files in version control at all (and definitely can't put them on github or anything).

If Terraform is going to store secrets, then some sort of field-level encryption should be built in as well. In the meantime, I'm going to change things around to use https://github.com/AGWA/git-crypt on sensitive files in my repos.

@bitglue commented Jan 28, 2015

See #874. I changed the RDS provider to store an SHA1 hash of the password.

That said, I'm not sure I'd agree that it's Terraform's responsibility to protect data in the state file. Things other than passwords can be sensitive: for example, if I had a security group restricting SSH access to a particular set of hosts, I wouldn't want the world to know which IP they'd need to spoof to gain access. The state file can be protected orthogonally: you can simply not put it on GitHub, you can put it in a private repo, you can use git-crypt, etc.

@kubek2k (Contributor) commented Jan 28, 2015

Related: #689

@dentarg commented Mar 17, 2015

Just want to give my opinion on this topic.

I do think Terraform should address this issue. I think it will increase the usefulness and ease of use of Terraform.

Some examples from other projects: Ansible has vaults, and on Travis CI you can encrypt information in the .travis.yml file.

@ketzacoatl (Contributor) commented Mar 28, 2015

Ansible's vault is a feature I often want in other devops tools. Protecting these details is not as easy as protecting the state file... what about using Consul or Atlas as a remote/backend store?

+1 on this

@dayer4b (Contributor) commented May 28, 2015

I just want to point out that, according to official documentation, storing the state file in version control is a best practice:

https://www.terraform.io/intro/getting-started/build.html

Terraform also put some state into the terraform.tfstate file by default. This state file is extremely important; it maps various resource metadata to actual resource IDs so that Terraform knows what it is managing. This file must be saved and distributed to anyone who might run Terraform. We recommend simply putting it into version control, since it generally isn't too large.

(emphasis added)

Which means we really shouldn't have to worry about secrets popping up in there...

@hobbeswalsh commented Jun 18, 2015

👍 on this idea -- it would be enough for our case to allow configuration of server-side encryption for S3 buckets. Any thoughts on implementing that?
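
For context, a minimal sketch of this kind of configuration as the S3 backend supports it today (the bucket name and key are placeholders; encrypt and kms_key_id are the relevant settings):

terraform {
  backend "s3" {
    bucket  = "example-terraform-state" # placeholder bucket name
    key     = "prod/terraform.tfstate"  # placeholder state path
    region  = "us-east-1"
    encrypt = true # request server-side encryption of the state object
    # Optionally use a customer-managed KMS key instead of SSE-S3:
    # kms_key_id = "alias/terraform-state"
  }
}

Note that this only protects the state at rest in the bucket; anyone who can read the bucket still reads the plaintext values.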

@apparentlymart (Member) commented Sep 11, 2015

At the risk of adding scope to this discussion, I think another way to think of this is that Terraform's current architecture is based on a faulty assumption: Terraform assumes that all provider configuration is sensitive and that all resource configuration isn't sensitive. That is wrong in both directions:

  • Several resources now take passwords as inputs or produce secret values as outputs. In this issue we see the RDS password as one example. The potential Vault provider discussed in #2221 is another example.
  • Several provider arguments are explicitly not sensitive, such as the AWS region name, and excluding them from the Terraform state results in Terraform having an incomplete picture of the world: it can see that there is an EC2 instance with the id i-12345, but it can't see what region that instance is in without the help of the configuration. Changing the region on the AWS provider causes Terraform to lose track of all of the existing resources, because as far as the AWS provider is concerned they've all apparently been deleted.

So all of this is to say that I think overloading the provider/resource separation as a secret/non-secret separation is not the best design. Instead, it'd be nice to have a mechanism on both sides to distinguish between things that should live in the state and things that should not, so that e.g. generated secrets can be passed into provisioners but not retained in the state, and that the state can encode that a particular instance belongs to a particular AWS region and respond in a better way when the region changes.

There are of course a number of tricky cases in making this work, which I'd love to explore some more. Here are some to start:

  • If you don't retain something in the state then it's not safe to interpolate it anywhere because future runs will assume they can interpolate attributes from existing resources in the state.
  • Some provider config changes effectively switch all resources to an entirely new "namespace", and thus force every attached resource to be destroyed and recreated in the new namespace. The AWS region is one example, since AWS resources are region-specific. But it's not so simple for other arguments: the AWS access_key might change what Terraform has permission to interact with, but it doesn't change the id namespace that resources live within.

@little-arhat commented Sep 18, 2015

Hi, any progress on this? Terraform 0.6.3 still stores raw passwords in the state file. Also, as a related issue: if you do not want to keep passwords in configuration, you can create a variable without a default value. But this will force you to pass this variable every time you run plan/apply, even if you're not going to change the resource that uses this password.

I think it would be nice to separate sensitive stuff from other attributes, so that it will:

  • be stored as a SHA1 hash (or similar) in the state file
  • not require a value if the state already has one.

So, for configuration like:

variable db {
    password {}
}

resource ... {
    password = "${var.db.password}"
}

Terraform would require the variable on the first run, when it doesn't have anything stored, but would not require it on subsequent runs.

To change such a value, one would need to provide a different value for password.
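
For reference, a hedged sketch of the current workaround described above, in era-appropriate syntax (the variable and resource names are placeholders): a variable with no default keeps the password out of the configuration, but it must be supplied on every run and still lands in the state.

variable "db_password" {
  # No default: the value must be supplied on every plan/apply,
  # e.g. terraform apply -var="db_password=..." or via TF_VAR_db_password.
}

resource "aws_db_instance" "main" {
  # ... other arguments elided ...
  password = "${var.db_password}" # still written to terraform.tfstate in plaintext
}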

@EvanKrall (Contributor) commented Oct 16, 2015

Maybe there's a simple solution: store the state in Vault?

@mwarkentin (Contributor) commented Oct 19, 2015

A good solution for this would be useful for us as well - we're manually configuring certain things to keep them out of the tfstate file in the meantime.

@ascendantlogic commented Oct 30, 2015

So as I slowly cobble together another clean-sheet infra with Terraform, I see this problem still exists, and this issue is almost exactly 1 year old. What is the thinking with regard to solving this? The ability to mark specific attributes within a resource as sensitive and store SHA1 or SHA2 hashes of their values in the state for comparison? I see this comment on a related ticket; does that mean that using Vault will be the prescribed way? I get that it promotes product synergy, but I'd really like a quick-n-dirty hashing solution as a fallback option, if I'm honest.

@ketzacoatl (Contributor) commented Oct 30, 2015

Moving secrets to Vault, and using consul-template or integrating with whatever other custom CM solutions you have, certainly helps in a lot of cases, but completely keeping secrets out of TF, or out of the TF state, is not always reasonable.

@ascendantlogic commented Oct 30, 2015

Sure, in this particular case I don't want to manually manage RDS, but I don't want the PW in the state in cleartext, regardless of where I keep it. I'm sure this is a somewhat common issue. Maybe an overarching ability to mark arbitrary attributes as sensitive is shooting for the moon, but a good start would be anything that is obviously sensitive, such as passwords.

@jfuechsl commented Nov 26, 2015

Would it be feasible to open up state handling to plugins?
The standard could be to store it in files, as is currently done.
Other options could be Vault, S3, Atlas, etc.

That way this issue can be dealt with appropriately based on the use-case.

@brikis98 (Contributor) commented Dec 12, 2015

I just got tripped up by this as well, as the docs explicitly tell you to store .tfstate files in version control, which is problematic if passwords and other secrets end up in the .tfstate files. At the bare minimum, the docs should be updated with a massive warning about this. Beyond that, there seem to be a few options:

  1. Offer some way to mark variables as secret and either ensure they never get stored in .tfstate files or store them in a hashed form.
  2. Encrypt the entire .tfstate file.
  3. Remove the recommendation to store .tfstate files in version control and only recommend them to be stored in secure, preferably encrypted storage.

@ejoubaud commented Jan 5, 2016

One thing to consider around this is outputs. When you create a resource with secrets (key pair, access keys, db password, etc.), you likely want to show the secret in question at least once (possibly in the stdout of the first run, as outputs do).

Currently, outputs are also stored in plain text in the .tfstate, and can be retrieved later with terraform output.

One possible solution would be a mechanism to show the secrets only once, then not store them at all and never show them again (like AWS does), possibly using the only-once outputs I suggested in #4437.

@revett commented Jan 14, 2016

+1

@sstarcher commented Jan 19, 2016

+1

@daveadams (Contributor) commented Mar 20, 2020

One use case that I think is being overlooked here is state sharing by different teams. For example, we have an S3 bucket in which we store state for thousands of TF projects maintained by dozens of teams. Some of those projects fetch the state of other projects from that same bucket, and use outputs of one project as input values.

But we don't want secrets stored in all the projects to be readable by every team. We might want them to be able to look up an RDS DB's DNS name, but not its root password, for example. And we don't want to have to worry about what happens if state gets copied locally for troubleshooting or for any number of other reasons.

This is not achievable by an all-or-nothing encryption scheme where access to anything is access to everything. Remote execution doesn't solve this problem. We would still want a barrier between different projects and teams. Executing an apply on project 1 that needs some tidbit of output from project 2 shouldn't imply that project 1 can read all of project 2's secrets, whether the apply is being run by a human or by a CI system.
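
For context, the cross-project pattern being described looks roughly like this (the bucket and key are placeholders); note that reading even one output this way requires read access to the entire state file of the other project, secrets included:

data "terraform_remote_state" "project2" {
  backend = "s3"
  config = {
    bucket = "shared-state-bucket" # placeholder shared bucket
    key    = "project2/terraform.tfstate"
    region = "us-east-1"
  }
}

locals {
  # Using one harmless output still implies access to everything in project2's state.
  db_dns_name = data.terraform_remote_state.project2.outputs.db_dns_name
}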

@pecigonzalo commented Mar 23, 2020

But we don't want secrets stored in all the projects to be readable by every team

You can achieve this by storing the output somewhere else, like SSM, instead of querying the state. This is a known pattern used in many projects; check https://github.com/cloudposse for example.
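
A hedged sketch of that SSM pattern (the parameter name and the referenced aws_db_instance are placeholders): the producing project publishes only the values it wants to share, and IAM can then scope consumers to those parameters rather than to the whole state file.

# Producer project: publish a non-secret output.
resource "aws_ssm_parameter" "db_address" {
  name  = "/project2/db/address" # placeholder parameter name
  type  = "String"
  value = aws_db_instance.main.address # assumes a database resource named "main"
}

# Consumer project: read just that value.
data "aws_ssm_parameter" "db_address" {
  name = "/project2/db/address"
}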

Remote execution doesn't solve this problem.

Why? How is this different from how it works on CF?

@daveadams (Contributor) commented Mar 23, 2020

So @pecigonzalo, yes, if we added three new layers of complexity around using Terraform (a second external storage system for "safe" shared state, plus the wrappers to implement it, plus an external system for running Terraform), we could achieve state safety. It's true that you can solve pretty much any software problem with wrappers and indirection.

But this ticket is not asking for ways to work around Terraform's limitations. We are hoping to get a feature built into Terraform that will avoid the need for all that extra complexity, and make secure state easier to achieve in a lot of scenarios that are different from yours.

@pecigonzalo commented Mar 23, 2020

@daveadams
Using the state across teams as an interface is IMO a misuse which can even bite you in the future, as your Terraform state is now locked in as the interface between two teams. This is like two separate services talking to the same database.
The state is primarily the state of Terraform, not an access controller between teams. If you want to achieve proper access control, you most likely are going to need a tool that is intended to do that.

Terraform must store state about your managed infrastructure and configuration. This state is used by Terraform to map real world resources to your configuration, keep track of metadata, and to improve performance for large infrastructures.

In any case, you have not answered how this is not solved by remote state plus remote execution, given that individual users would no longer have access to state files.

@pecigonzalo commented Mar 23, 2020

Without a native provider you could do something like (pseudocode):

Write:

locals {
  sharedoutput = {
    address   = "asda"
    someother = "thisthat"
  }
}

resource "aws_s3_bucket_object" "examplebucket_object" {
  key                    = "someobject"
  bucket                 = "somebucket"
  content                = "${jsonencode(local.sharedoutput)}"
  server_side_encryption = "AES256"
}

Read:

data "aws_s3_bucket_object" "bucket_output" {
  bucket = "somebucket"
  key    = "someobject"
}

locals {
  bucketinput = "${jsondecode(data.aws_s3_bucket_object.bucket_output.body)}"
}

This should work similarly to retrieving state, but allows you to customize what you put there and how you share it. It's somewhat similar to my SSM suggestion, maybe a bit simpler given it's a single file.

If this were provided as a native resource to get/set values, it would be quite straightforward. Although, to be honest, this is quite simple to do as it is.

@daveadams (Contributor) commented Mar 23, 2020

For remote execution, if you can refer to other remote state in your project, you can (intentionally or accidentally) leak secrets from it.

@daveadams (Contributor) commented Mar 23, 2020

I'm honestly confused why you are pushing back so hard on this @pecigonzalo. We all understand there are workarounds. We are all using workarounds already, which is why this feature is being requested.

@pecigonzalo commented Mar 23, 2020

Because I believe a lot of the urgency pushed here is unfounded. People compare this to CF, but on CF they run remotely, apparently without complaints, while here they want to run locally and still not have access to the secrets.

Many of the use-cases, like the one about output sharing you described, are either not really a problem of Terraform state management and do not belong on this issue, or are not a problem at all, as explained and shown in my example.

To add: how is my split S3 output example a lot more complex than just sharing state? It's 3 lines of code, fairly clear, and decoupled from Terraform itself.
IMO sharing state files is a lot more complex, and a bad practice wherever it is applied, because you cannot mutate the interface, change buckets, or change config without affecting downstream users, while in my example you can easily mutate and evolve the interface.
It's, as I said before, as bad a practice as the example of two separate services accessing the same database.

@pecigonzalo commented Mar 23, 2020

Just another clarification: I am in favor of the feature, not because it prevents user access when running locally or because we need to share outputs, but to avoid secrets being copied outside of their sources when used in combination with, let's say, Vault or other similar tools.
That is a much smaller problem than mixing all these things together.

@acdha commented Mar 23, 2020

@pecigonzalo I think it would be helpful if you were more careful to respect that not everyone works the same way, on the same projects, with the same security footing as you, and that's perfectly valid. This thread is full of valid use-cases, and given that this is a security consideration, I would especially suggest considering how the default behaviour being less secure could shape an organization's view of Terraform as they're just getting started, especially given that this does not happen to anyone using CDK, which is secure by default:

CDK RDS:

By default, the master password will be generated and stored in AWS Secrets Manager.

My suggestion back in October, to have Terraform use Secret versions for the state record and only retrieve the value right before performing the actual API call to create/update, would not be a substantial change. It would allow lower-privilege plans, would work both locally and remotely with any backend, and would allow the documentation for popular services like RDS to have a clear, simple "Always use {Secrets,SSM} to store the password" message, plus a runtime warning/error if you did anything else. Given how many times I've seen this crop up on different projects, that seems like a meaningful real-world security win even if it doesn't solve everything.
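
What exists today comes close but does not fully realize that suggestion; a sketch, assuming a secret already stored in Secrets Manager (the secret_id is a placeholder):

data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/db/master-password" # placeholder secret name
}

resource "aws_db_instance" "main" {
  # ... other arguments elided ...
  password = data.aws_secretsmanager_secret_version.db.secret_string
  # Caveat: data source results, including secret_string, are still
  # persisted in the state file, which is exactly the gap described above.
}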

@pecigonzalo commented Mar 23, 2020

@acdha I do respect that that is the case; that is why I'm trying to point those users in what I believe is the right direction. I do not agree that the thread is full of valid use-cases; many of them are indeed trying to do something that they should be doing some other way.
This is not a minor change; it's quite complex, and because of that we should limit the scope and be clear on what we need and want to achieve with this.

Your suggestion in October is a good approach, but the main reason that works is, as mentioned multiple times, because CF runs remotely; we don't actually know if somewhere in a private state CF stores a SHA or something to be able to tell whether the secret changed.
While I agree it would be good to have a simpler way to reference these resources, you can do this today by using SSM data resources and not applying locally.

@acdha commented Mar 23, 2020

Your suggestion in October is a good approach, but the main reason that works is, as mentioned multiple times, because CF runs remotely; we don't actually know if somewhere in a private state CF stores a SHA or something to be able to tell whether the secret changed.

Why can't Terraform store the version number of the SSM Parameter or Secret and trigger an update any time the specified / queried version (in the AWSLATEST case) changes? Both services use immutable versions, so the version is guaranteed to change any time the value changes.

@pecigonzalo commented Mar 23, 2020

Yeah, I already said I think it's a good approach, but you need something that translates to other providers as well, and it also needs the user to query the parameter on plan, so you still need access to the secret.

@jakubigla commented Mar 28, 2020

Why can't we have something similar to lifecycle -> ignore_changes, like ignore_state_file_values?

If the user specifies a key, the value won't be stored, and terraform plan will not check this key for changes, just like lifecycle -> ignore_changes. This could be done at the provider level too.
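
For comparison, a sketch of what exists today versus this proposal (the ignore_state_file_values key is hypothetical, taken from the suggestion above, and is not a real Terraform argument):

resource "aws_db_instance" "main" {
  # ... other arguments elided ...
  password = var.db_password

  lifecycle {
    # Exists today: stops diffing the attribute after creation,
    # but the value is still written to state on the first apply.
    ignore_changes = [password]

    # Hypothetical, per this suggestion: never store the value at all.
    # ignore_state_file_values = [password]
  }
}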

@bgshacklett commented Mar 29, 2020

What is being asked here is for local runs to not be able to access those secrets, which is more than complicated, given that you want your local machine to evaluate and compare state to be able to apply a plan, but to do so without access to secrets, which are part of what it might need to change or evaluate.

Logically speaking (tech debt and refactoring aside), I see two ways around this:

  1. Remove the evaluation of secrets from Terraform's scope; just pass them through. Update the secret every time unless the user configures the lifecycle rules to ignore changes on that value. In the long term, perhaps a method for hinting to Terraform that the value should be updated could be passed in from the outside.
  2. Ensure that secrets are hashed prior to any persistence or outputs other than resource configuration. If it's truly necessary for Terraform to know whether a value has changed, it's just as simple to compare two hashes as it is to compare the value of two secrets.

Either of these would need to be opt-in, obviously, but that's a fairly standard pattern now, and Terraform already has a reasonably complex type system which should be able to handle additional metadata. Providers would require updates, but this might be doable in a largely backwards compatible manner, resulting in those updates being limited to cases where a resource's value might be a secret.

On another note, the question of local or remote runs continues to be brought up. Aside from auditability, I don't see how it matters from a security perspective. @acdha made a very good point, previously:

Think about the time scale and who has access: under the S3 method, anyone who ever gets access to that bucket or any system where Terraform has been used has a copy of all of the credentials.

This is absolutely accurate. The state file doesn't need to be stored as a copy locally, because the system has full access to the bucket. Even if we assume that Jenkins, CodeBuild, Azure Automation, ad nauseam, is running the builds in an environment with no direct access via SSH or other methods, it's still quite easy for anyone who can run Terraform jobs to pull down a copy of that state file from S3. The whole point of Jenkins, or any of these CI tools, is to run commands. Actually, let me rephrase that: "the main purpose of a system like Jenkins is to execute arbitrary code on behalf of the end-user!"

Running in an automated system does not make you more secure. At best, it makes things repeatable (the original goal of CI/CD) and more visible, so that exploits are, hopefully, seen and tracked. At worst, it gives you a false sense of security leading to unintended data exfiltration.

@zerkms commented Mar 29, 2020

Ensure that secrets are hashed prior to any persistence or outputs other than resource configuration. If it's truly necessary for Terraform to know whether a value has changed, it's just as simple to compare two hashes as it is to compare the value of two secrets.

@bgshacklett not all values can be obtained on the second run, e.g. an AWS IAM access key secret.

So it won't work.

@daveadams (Contributor) commented Mar 29, 2020

It's worth pointing out that if the infrastructure were in place in the Terraform code to provide a means of hashing potentially secret values, then you'd have all the pieces you'd need for per-field encryption via Transit or PGP or KMS or whatever. Encrypting and decrypting values in any of those systems is very easy to do. The hard part is that Terraform would need to provide a common mechanism for doing so, rather than the very rare one-off provider resource that has some resource-specific methods, e.g. aws_iam_access_key.
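
As a hedged illustration of those building blocks, KMS can already encrypt an individual value today (the key alias is a placeholder), though the plaintext still passes through Terraform and data source results still land in the state:

data "aws_kms_ciphertext" "db_password" {
  key_id    = "alias/terraform-secrets" # placeholder KMS key alias
  plaintext = var.db_password
}

output "db_password_ciphertext" {
  # Only decryptable by principals with kms:Decrypt on the key.
  value = data.aws_kms_ciphertext.db_password.ciphertext_blob
}

The missing piece, as noted above, is a common mechanism in Terraform itself rather than per-resource one-offs.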

@pecigonzalo commented Mar 30, 2020

Why can't we have something similar to lifecycle -> ignore_changes like ignore_state_file_values?

It would be good, but I don't think it's that simple: if I remember correctly, ignore_changes still stores the value, as it is required to resolve values for other dependent resources; it just stops applying and storing subsequent changes. Check #15797 as well for more info on this topic.

Remove the evaluation of secrets from Terraform's scope; just pass them through. Update the secret every time unless the user configures the lifecycle rules to ignore changes on that value. In the long term, perhaps a method for hinting to Terraform that the value should be updated could be passed in from the outside.

Wouldn't that mean you need access to the secret every time anyway?

Ensure that secrets are hashed prior to any persistence or outputs other than resource configuration. If it's truly necessary for Terraform to know whether a value has changed, it's just as simple to compare two hashes as it is to compare the value of two secrets.

  • How does this work if I add a new resource afterwards that uses that secret? Terraform needs to know how to "unhash" it so it can send it to the resource.
  • To apply/plan you need to retrieve the secrets on the runner, so if it's local, you need access to the secret anyway.

On another note, the question of local or remote runs continues to be brought up. Aside from auditability, I don't see how it matters from a security perspective.

It matters because the users no longer have direct access to the secrets, and it keeps being brought up because people keep comparing local Terraform runs to remote CloudFormation runs, which are not comparable.

Think about the time scale and who has access: under the S3 method, anyone who ever gets access to that bucket or any system where Terraform has been used has a copy of all of the credentials.

This is absolutely accurate.

I do not think it is: systems that ran Terraform do not store a local copy of your state when using remote state. They just keep this, for example:

{
    "version": 3,
    "serial": 1,
    "lineage": "UUID",
    "backend": {
        "type": "s3",
        "config": {
            "bucket": "BUCKET",
            "key": "KEY",
            "region": "REGION"
        },
        "hash": HASH
    },
    "modules": [
        {
            "path": [
                "root"
            ],
            "outputs": {},
            "resources": {},
            "depends_on": []
        }
    ]
}

The "modules" and related sections remain empty.

You can also limit access fairly simply, by using different KMS keys or buckets to control who accesses which states, like you would do for secrets. The only problem here would be if you are using remote state to populate inputs to other Terraform projects, which I think is solved by my previous example (and which could indeed be made simpler with a native resource that bundles that logic across providers).

Running in an automated system does not make you more secure. At best, it makes things repeatable (the original goal of CI/CD) and more visible, so that exploits are, hopefully, seen and tracked. At worst, it gives you a false sense of security leading to unintended data exfiltration.

I disagree; it depends quite a lot on the implementation. It's not about being an automation system or not, it is about restricting which systems have access to the secrets and how they can access those secrets; otherwise your blanket statement would apply to all of your systems that have access to secure content, like the systems using said secrets.

If your main concern is internal, intentional hacking, then IMO allowing people to run arbitrary code on your CI/CD or whatever system with escalated privileges is your main problem in any case, not Terraform in particular. The automation system already has access to create and modify resources (in order to run terraform apply) as well as passwords, etc., so it has access to retrieve those values, or change them to allow access, or even destroy them in many cases.

Many automation systems already have protections for these kinds of situations, like preventing certain commands, or only allowing parameters to be passed in while using the pre-defined run configuration.
Just in case: this is the same with CloudFormation; anyone that can push CloudFormation could modify it to store the secret somewhere or print it out and then retrieve it.

It's worth pointing out that if the infrastructure is in place in the Terraform code to provide a means of hashing potentially secret values, then you'd have all the pieces you'd need to make per-field encryption via Transit or PGP or KMS or whatever.

But what do you gain here? If you can still decrypt the secrets, then there is no difference.


The best solution for most resources is IMO outside of Terraform: have the actual resources be able to retrieve secrets from pointers to secrets (like ECS does with SSM/Secrets Manager, RDS with Secrets Manager, Kubernetes with many providers, etc.). This ensures no tool has access except the secret store and the resource.

There are other critical cases, like aws_iam_access_key or tls_private_key, which are a bit more complicated, as they have to store or present the output somewhere.
aws_iam_access_key has pgp_key for encrypting the output, but it is still stored in the state.
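
A sketch of that rare per-resource mechanism (the IAM user and Keybase handle are placeholders): with pgp_key set, the provider exposes the secret as a PGP-encrypted encrypted_secret attribute rather than in plaintext.

resource "aws_iam_user" "deploy" {
  name = "deploy" # placeholder user
}

resource "aws_iam_access_key" "deploy" {
  user    = aws_iam_user.deploy.name
  pgp_key = "keybase:example_handle" # placeholder; a base64-encoded PGP public key also works
}

output "deploy_encrypted_secret" {
  value = aws_iam_access_key.deploy.encrypted_secret
}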

I think it would be good to list the cases that are a "priority" or a problem, as not all resources or sensitive content are equal, so we can go over them and evaluate possible fixes.
For example:

  • Access outputs of another Terraform state (as done with data.terraform_remote_state).
    This is a problem because cross-team access to outputs requires cross-team access to state, which can be really risky given the secrets in the state. While we can use what I showed before, it's not intuitive or straightforward. Simply splitting state and outputs would greatly improve the workflows across teams.

@bgshacklett commented Mar 30, 2020

@zerkms In that particular case, comparison of the current and previous values is impossible anyway, so what we're really looking for is interpolation of the value into another resource. I suggest that, on an opt-in basis, this functionality be removed for specific values, as it is 100% reliant on storing the value in the state file, which is the very thing we're looking to avoid in this issue. It's an "eat your cake and have it, too" situation.

Ideally, some middle ground could eventually be found, such as allowing a one-time interpolation of the value while the data remains in memory. This would allow for the creation of a secret in the user's preferred secrets management system which could be referenced at a later time, taking it out of Terraform's scope.

@daveadams You're suggesting reversible encryption via KMS or another mechanism? I don't believe this goes far enough. I will acknowledge that the granularity of access control would be much better, allowing other teams' state files to be referenced without their secrets being accessible. It would not, however, prevent a user who has privileges to run Terraform from retrieving application-level secrets after they're created. If Terraform has the capability to decrypt these secrets, then so does that user.

@zerkms commented Mar 30, 2020

In that particular case, comparison of the current and previous values is impossible anyway

@bgshacklett it's not needed: in that particular case you need to store the secret value in the state.

My point was that hashing is not generic and does not work.

which is the very thing we're looking to avoid in this issue.

We don't; the state should be encrypted, and then this is not an issue anymore.

@gtmtech commented Apr 13, 2020

Only having support for encrypting the whole statefile via a remote backend is not very good, IMO. Firstly, if you want encryption it forces you to use a remote backend that supports it, which limits users' choice (no, I don't want to use Terraform Cloud). Secondly, it's far too coarse-grained, not allowing scenarios such as the one Dave supplies above, where different people have different concerns within a statefile.

Like a lot of Terraform's "solutions", I find this one is geared to a not-very-big project where you don't have the kind of concerns that you do in large companies where granular security is important.

I had the same argument over at Ansible with their next-to-useless ansible-vault, which encrypts the entire vars file, not allowing meaningful diffs of changed elements. I had the same argument over in Puppet years ago, where their puppet-hiera-gpg encrypted an equivalent file; I ended up joining forces with Tom Poulton to write puppet-hiera-eyaml, which encrypts individual items instead. It wasn't perfect (using a single encryption key, for starters), but people loved it at the time because it enabled diffs etc. and could be built upon to provide multi-key encryption. Hashicorp Vault is full of good ideas around encryption; I'm surprised they haven't got more involved in helping Terraform out. The integration between Terraform and Vault isn't as good as it needs to be (which is surprising for Hashicorp as a company, right?).

At least adding a pgp_key to an individual item on supported resources provided an attempted (but ugly) solution for not having sensitive values in the state while addressing different teams' concerns in the same statefile. It's a shame to see that being deprecated everywhere and not replaced with anything else. For me it's the end user who should be able to decide whether a particular item is sensitive, not a developer putting a Sensitive flag in the code, and in the event the end user has decided it's sensitive, they should be able to assert that (1) it should not be stored in the state at all, (2) it should be stored but encrypted with a supplied key, or (3) it should be stored but encrypted via a pluggable API, e.g. Vault. And finally (4), on top of all that, the backend could provide additional encryption of the whole thing if necessary, fulfilling the encryption-at-rest requirement if the user doesn't like the other automatic encryption-at-rest given to them via e.g. S3 buckets.

People often confuse the many purposes of encryption. One concern (as in encrypting the whole thing) is encryption-at-rest for compliance purposes, which is ONLY about people siphoning information off of hard drives plucked from the computers they were running in. It's very important to have, but it does not tackle the other purposes of encryption: encryption in transit, and guarding information from people within your own and associated teams, for which granular value-based encryption is also essential in a highly collaborative, large working environment that also needs to adhere to internal information standards such as ISO27001.

It would seem that Terraform has done the first assuming it will solve the rest. I'd like to see a multi-purpose approach which could cater for the others. I'm doing ridiculous workarounds such as setting a password to FOO with a dependent and triggered null_resource which resets it to BAR, so BAR doesn't end up in the state. I'd like to see encryption as a first-class, well-thought-through feature in Terraform rather than more workarounds.

@bernata commented May 1, 2020

The scenario for the IAM secret access key is, in my view, a compelling reason to support encrypting sensitive values. An example problematic scenario today:

  1. A dev in a team wants to run Terraform to create production infrastructure.
  2. The Terraform run outputs the IAM secret access key for an IAM user created in the run.
  3. The dev reads the credentials and writes them into the .aws/credentials file on an on-premise server.
  4. The tfstate, containing the same secret access key, is stored in an encrypted S3 bucket.

The dev should not really have access to these production credentials. Creating a pipeline to run the Terraform would help by limiting access to the tfstate and the outputted credentials, but that is a fairly big infrastructure requirement. Encrypting the outputted credentials to a public key would mean the on-premise servers could decrypt and write the .aws/credentials at startup without humans being able to see the secret. It would also mean the credentials could be checked into source control [encrypted], making deployment of the credentials to on-premise servers safe.

Please consider handling this specific scenario and supporting multiple encryption schemes, not just PGP; I created PR 13119 to support RSA OAEP encryption of secret access keys in the aws provider.
