-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ability to not store certain attributes in TF state #30469
Comments
Hi @mbrancato! Thanks for sharing this use-case. A key part of meeting this request would be some way to ensure it doesn't cause Terraform to send invalid data to providers when performing the typical resource instance lifecycle actions that rely on previously-saved data: read, update, and delete. The current API contract is that a provider will receive exactly what it previously returned, and so providers are in principle allowed to rely on any attribute being preserved verbatim from the previous run. Just naively dropping any arbitrary attribute would therefore break this contract and thus cause providers to misbehave, potentially to the point of crashing altogether if the resulting object is something that the provider could never have generated itself and therefore not an input it's built to expect. This part of the problem could potentially be addressed by adding something new to the provider schema model to allow providers to nominate a particular subset of their attributes as "identifying", which would then make them ineligible to be excluded from the state to ensure that they'll always be preserved for future operations. Providers would then presumably not mark attributes likely to accept sensitive values as being identifying. A second part of the problem is that anything that's not included in the state must today not be available for references elsewhere, because references get populated from the state. In this case, that means that if you were to make Given all of this, I think a successful design to meet this use-case would need to address both of the following in detail:
|
As always great explanation @apparentlymart , thanks for that I would argue that selecting attributes still has a use case for pure data objects. With data it is assumed that you know what you are doing and you are querying resources that might have been created outside your Terraform state, by TF or other sources. To give you one example: say you would like to store data "aws_secretsmanager_secret_version" "super_secret" {
secret_id = data.aws_secretsmanager_secret.super_secret.id
} even if you only need I understand that whatever executes the TF has access to the object as a whole, still I think it would be an improvement to either exclude certain attributes from "data" or to have the ability to explicitly provide a list (could maybe default to all if not provided) |
@apparentlymart I might have missed something but I'll try to address your following points:
Thanks for your response on the issue and I hope this can be implemented in some kind of way in the future! |
I agree this would be nice. Especially with data objects it should be possible. Right now it happens that I want to use a data-object so I can reference certains attributes of this data object in my configuration, but I still want to keep other (sensitive) attributes from my state file. This is not possible right now. Even though i'm not using those specific attributes of the data object, they are still visible in my state file. My workaround would be to stop using data-objects and hardcoding certain things in my configurations, but the drawbacks regarding maintainability are way to large (possible breaking changes, the possibility that I introduce invalid configurations, etcetera) |
@apparentlymart Let me add another use case, one that I do not see any other way around at the moment: provider API keys. If you have any workarounds to offer, please do. A lot of providers require some kind of API key to authenticate to the manager of the resources they are trying to manage. For example, the Kubernetes provider requires credentials to authenticate to the Kubernetes cluster. In order to support Terraform in the cloud, where I often cannot safely store or even create files, I store the credentials in some secure storage such as AWS SSM Parameter Store. This is fine and secure, and I can then do something like provider "kubernetes" {
client_key = data.aws_ssm_parameter.kubernetes_client_key.value
...
} The problem with this, as we all know, is that everything about The simplest solution is to provide a way to prevent the value of Data Sources from being stored in the state. You could add a requirement that the data source not change its value between plan and apply. You could enforce this by saving a secure hash of the Data Source value (perhaps computed by For this subset of use cases, you could in fact require that the value not be referenced in resources and only be referenced in provider blocks. A more general solution would be to encrypt sensitive data in the Terraform state, but of course this begs the question of how access to the encryption/decryption keys would be managed and where would they be stored. I don't have a fully thought out solution to that, but my current leaning is toward using a public/private key pair, where writing to state would encrypt using the public key, and reading from state would decrypt using the private key. In this approach, all sensitive data would be encrypted, and, most importantly, it would not be an error to fail to decrypt the sensitive data. Instead, of the decryption key were not available, the sensitive data would be marked sensitive/unknown/tainted, and any operation requiring it would fail, but any other operation (such as The public key could be stored in the state itself, though it would require some kind of protection against being maliciously replaced. The private key would only be needed at apply time or when the sensitive value changed, as could be determined by storing the above mentioned hash of the value in the state file. I am very open to other ideas, but we need something that would work with Terraform Cloud and an S3 backend. The current situation seems fairly untenable. |
Our security team is telling me not to use Terraform for certain things, and this would let me stay on Terraform. The latest example for me is the What if we just didn't store the value in the state file if |
Another comment in favor of something like this - Our use case is similar but not related to security but more akin to preventing "dirty plans" from additional tracked state. We utilize AWS Organizations to create Organizational Units for billing and cost management reasons, but what OU any given AWS account is in is not handled by terraform. However, Terraform seems to want to keep a copy of the state of what Accounts are within an OU. So while the accounts within an OU are managed outside of Terraform, the So even for a noop terraform change, the plan my still contain "changes" just because Terraform is updating its state. Applying the Example
Note the "Note: Objects have changed outside of Terraform" and "No changes. Your infrastructure matches the configuration." lines. It would be nice to be able to explicitly exclude changes like this so that every |
We also have a similar need for this, where we have attributes that are not configured in the resource like "last_checkin" and "last_boot" which if changed have a similar output as stated above. If there's 50 of the same type of changes, even though the actual plan only requires one change, then the terraform plan looks messier than it needs to be. |
Why not be able to mark some data as sensitive and then have it hashed before saved to state? |
Just bumping this: We use TF to create our DB in RDS, and set its master password, and then store it into a secret in AWS Secrets Manager. We later use that secret in a data call in another module to create roles & users. A combination of ignore changes, and marking that secret to never be written to the state would be great. It could be written to the outputted plan, for passing to an apply, but that's it. It should never write back to the S3 state file. Plenty of people need to be able to read the state file, but we control the access to the secret much more tightly. |
Terraform state can contain secrets, this is well documented. But sometimes, there are situations where I absolutely just don't want Terraform to retain the secret value.
Current Terraform Version
Use-cases
Let's assume that TF is run as a service account which has the ability to create resources, and make some modifications to that resource. And let's assume that resources is a cloud database as well. The TF service account can create and set the initial user of the cloud database instance, but it doesn't immediately have access to the data in the database.
Again, let's assume the IAM permissions on the TF service account let it create the DB and initial user, but do not let it update the password for the DB.
What I would like is a way to allow TF to bootstrap creating the DB, setting the initial static admin username and password, passing those same credentials on to be stored in HashiCorp Vault. This could be static KV secrets or the database secret engine used to create dynamic credentials by policy. TF would read a random password from the Vault
sys/policies/password/*/generate
path, and use that to set a random password for the admin user. In the end, the TF state file would not have stored the values from the Vault data source, or the cloud database resource in the TF state.Note: this assumes the TF user knows that they need to ignore changes and otherwise compensate for the fact that the IAM permissions may not allow for updating the password and/or the random password would change on each TF run.
Attempted Solutions
I tried using
lifecycle.ignore_changes
- this works from a user experience perspective in that it will bootstrap the user and not change it on subsequent runs, but it still stores the actual password in the TF state.Proposal
A setting, similar to
sensitive_attributes
, that tells TF to store the empty / zero version of that attribute's type in the state and not the actual value used at runtime.Unlike
sensitive_attributes
, this would need to be configurable by the user, similar tolifecycle.ignore_changes
.This will cause config drift for user's who enable this and do not have a full grasp of where it is used.
The text was updated successfully, but these errors were encountered: