
terraform_data support sensitive value flag #32789

Open
Xboarder56 opened this issue Mar 7, 2023 · 23 comments

Comments

@Xboarder56

Terraform Version

1.4.0-rc1

Use Cases

Support sensitive=true for terraform_data objects.

While testing the new terraform_data object I noticed it returns sensitive data in plain text in the output field, even though it correctly identifies the sensitive data in the input value.

I’m currently testing the terraform_data resource to store API credentials in a map to pass into another resource.

Attempted Solutions

No workaround for hiding the sensitive output so far.

Proposal

Support sensitive=true in terraform_data resources.

References

No response

Xboarder56 added the enhancement and new (new issue not yet triaged) labels on Mar 7, 2023
@Xboarder56
Author

@jbardin I noticed the PR for the terraform_data object, attempted a few tests with it, and ran into this issue.

@apparentlymart
Member

Hi @Xboarder56! Thanks for sharing this feature request.

So we can understand the motivation for your request, it would help if you could share an example resource block showing what you were attempting to do. Although we may be able to improve terraform_data to meet your need, we might also be able to suggest a different way to solve your problem that doesn't involve any resources at all, since you mentioned your goal is to just store data in a map.


Regarding the current behavior of terraform_data: based on its implementation I would've expected sensitive values to transfer directly from input to output without losing their "sensitivity", because the provider literally just copies the entire value from input into output during its apply phase:

if !req.PlannedState.GetAttr("output").IsKnown() {
    newState["output"] = req.PlannedState.GetAttr("input")
}

Therefore I'd suggest that we start by understanding why that isn't working. I don't think we should really need a separate argument for sensitive here because this provider should be able to just preserve the sensitivity of the input, and therefore the "sensitivity flag" would be to just assign a sensitive value to input, without any need to set anything else.
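
For illustration, a minimal sketch of what I mean (the variable name is just a placeholder):

variable "api_key" {
  type      = string
  sensitive = true
}

resource "terraform_data" "example" {
  # Assigning a sensitive value to input; the expectation is that the
  # output attribute would preserve that sensitivity with no extra argument.
  input = var.api_key
}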

@Xboarder56
Author

Xboarder56 commented Mar 7, 2023

@apparentlymart sure thing. When terraform_data runs, the plan shows input as (sensitive value), but output displays the entire value in plain text. I included a partial screenshot of the plan.

The reason for the terraform_data object is so I can track changes on it and then trigger a lifecycle replace when the value changes.

[screenshot: partial plan output showing input redacted as (sensitive value) while output is displayed in full]

resource "xsoar_integration_instance" "example" {
  name               = "example"
  integration_name   = "example"
  propagation_labels = ["all"]
  config = {}

  provisioner "local-exec" {
    working_dir = "./scripts"
    interpreter = ["python3"]
    command     = "./update_integration.py"
    environment = {
      API_KEY                   = var.xsoar_api_key
      XSOAR_HOST                = var.host
      INTEGRATION_INSTANCE_NAME = "example"
      INTEGRATION_IGNORE        = true
      INTEGRATION_CONFIG        = terraform_data.example_config.output
    }
  }

  lifecycle {
    replace_triggered_by = [
      terraform_data.example_config.output
    ]
  }
}

resource "terraform_data" "example_config" {
  input = jsonencode([{
    name = "credentials",
    value = {
      password = data.vault_kv_secret_v2.secrets["example"].data["api"]
    }
    hasvalue = true
    },
    {
      name     = "ip_relationships"
      value    = []
      hasvalue = true
    },
    {
      name     = "url_relationships"
      value    = []
      hasvalue = true
    },
    {
      name     = "file_relationships"
      value    = []
      hasvalue = true
  }])
}

@jbardin
Member

jbardin commented Mar 7, 2023

Hi @Xboarder56,

What you see here is the intended behavior, because that is how all providers and resources work. A provider does not see any of the sensitive marks from core, and the returned value can only have the sensitive marks present in that resource's schema. Since the schema must be static and loaded before any planning operations can proceed, there is no way to change the resulting value within the current protocol.

Being part of the internal terraform provider, terraform_data does bypass the grpc layer, which gives us a little more freedom, but because the provider protocol was never designed to handle dynamically sensitive values, there is a lot of very detailed handling of the sensitive marks when calling a provider, which would make special-casing the terraform_data resource quite challenging. Adding sensitive marks to the protocol is technically possible, but that would immediately put the responsibility for correctly handling all derived sensitive values on the providers, which they are not equipped to do.

Having terraform_data behave differently also has its drawbacks, in that it introduces inconsistency in the resource behavior which can lead to user confusion or simply more bugs.

That leaves us with no good solution currently, but this is definitely something we need to look into in the future.

Thanks!

@jbardin
Member

jbardin commented Mar 7, 2023

I should also add that in most cases if you're only using terraform_data to store values you can use triggers_replace, and since that attribute is solely derived from the configuration, sensitivity is tracked only from the core side.
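
For example (the variable name here is just a placeholder):

resource "terraform_data" "example" {
  # triggers_replace is derived entirely from configuration, so a sensitive
  # variable referenced here keeps its sensitivity on the core side.
  triggers_replace = var.api_key
}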

@Xboarder56
Author

Xboarder56 commented Mar 7, 2023

@jbardin does triggers_replace return the value as an output when input isn’t provided? I was glancing over the notes in your PR documentation request and don’t see how to access the values stored in triggers_replace.

As an alternative, I was thinking I could put the terraform_data into its own module with an outputs.tf that exposes output and marks it sensitive there. That was kind of my last-resort workaround for the issue.
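
Roughly what I had in mind (module layout and names are placeholders):

# modules/sensitive_data/main.tf
variable "input" {
  sensitive = true
}

resource "terraform_data" "this" {
  input = var.input
}

# modules/sensitive_data/outputs.tf
output "output" {
  value     = terraform_data.this.output
  sensitive = true
}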

Thanks again to both of you for your support on this issue, especially since it concerns a release-candidate feature.

@jbardin
Member

jbardin commented Mar 7, 2023

@Xboarder56, can you explain what you are trying to do where you need the output value? You can reference triggers_replace just like any other attribute, but if your goal is to use replace_triggered_by, just reference the entire resource.

Running the output through a module output defeats the purpose of using terraform_data as a target for replace_triggered_by, because you lose the resource lifecycle with which you can trigger replacements.

@Xboarder56
Author

@jbardin The goal is to have a local-exec provisioner run every time INTEGRATION_CONFIG (an environment variable for local-exec) is changed/updated. That's why I stored the value of INTEGRATION_CONFIG in a terraform_data block: it gives me a resource that replace_triggered_by can watch, so the parent resource is redeployed and the local-exec provisioner runs against the new config data.

@apparentlymart
Member

Relying on the fact that the terraform.io/builtin/terraform provider is built in and requests don't have to navigate the wire protocol is exactly the sort of thing I had in mind as a low-key way to address this. I expect it would work because we successfully prototyped terraform_remote_state doing a similar thing at one point (dynamically marking its output values as sensitive), but ended up having to roll that back because it was effectively a breaking change to suddenly mark a bunch of things sensitive that weren't before.

However, it's evident from the observed behavior that just passing sensitive values through this resource type doesn't work today, so I assume that either something changed in the meantime, or perhaps Terraform is treating managed resource types differently than data resource types, since our previous experiment only tried this with a data resource type.

I think it would still be worth understanding what exactly is preventing this from working today, even if we decide that we want to retain that behavior for consistency with the external providers that use the wire protocol. Do you already have a sense of that @jbardin, or would figuring that out require some more research?

@jbardin
Member

jbardin commented Mar 7, 2023

@apparentlymart, all config and state values are unmarked prior to calling any provider method because the protocol cannot serialize marks. Perhaps stripping and re-applying marks could be pushed out into the grpc wrappers (not sure if that was ever considered in the initial implementation), but there is a lot of special handling of marks in core which needs to be accounted for, like conditionally calling methods if only marks have changed, or if values changed but marks haven't, etc. That does at least sound possible though, and once the internal interface technically accepts marks, making the exception within the provider's own code would be possible.

@apparentlymart
Member

Thanks! I suppose we must've at one point had a little hole in that which was letting sensitive values come back from data resources, since that was specifically what we were doing with terraform_remote_state before rolling that back. That's of course different from handling sensitive values being passed to the provider as arguments; I guess we don't have any code to always unmark anything coming back from a provider, because in the normal case it simply isn't possible for a sensitive value to traverse the wire protocol, so there would typically be nothing there to unmark.

@jbardin
Member

jbardin commented Mar 7, 2023

@Xboarder56 what I was describing would look more like the config below. There is no reason to actually pass the value through input/output here, since you are only concerned with triggering replacement based on a change in the config value.

resource "xsoar_integration_instance" "example" {
...
  lifecycle {
    replace_triggered_by = [
      terraform_data.example_config
    ]
  }
}

resource "terraform_data" "example_config" {
  triggers_replace = local.itegration_config
}


locals {
  integration_config = jsonencode([{
    name = "credentials",
    value = {
      password = data.vault_kv_secret_v2.secrets["example"].data["api"]
    }
    hasvalue = true
    },
    {
      name     = "ip_relationships"
      value    = []
      hasvalue = true
    },
    {
      name     = "url_relationships"
      value    = []
      hasvalue = true
    },
    {
      name     = "file_relationships"
      value    = []
      hasvalue = true
  }])
}

@Xboarder56
Author

@jbardin just confirmed your solution works perfectly.

Would you like me to close this out, since, as mentioned, terraform_data is working as expected with regard to secrets being visible in the output?

@jbardin
Member

jbardin commented Mar 7, 2023

Thanks @Xboarder56, I think we can leave this open for now, so we can take a closer look at whether there is any possibility of adding this type of value handling to the resource.

jbardin added the provider/terraform label and removed the new (new issue not yet triaged) label on Mar 7, 2023
@Miroka96

I am interested in this feature as well because I use the terraform_data resource together with a remote-exec provisioner that is triggered on destroy. Terraform tells me that provisioners at destroy-time can only access the object itself. Therefore, I store all connection information, including the SSH private key, in the input of the terraform_data resource.
On resource destruction, this prints the whole output, including the SSH private key, and I don't know how to avoid it.
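
For reference, the pattern looks roughly like this (resource names, user, and script path are placeholders):

resource "terraform_data" "teardown" {
  # Everything the destroy-time provisioner needs has to live in input,
  # because at destroy time only self is accessible -- including the
  # SSH private key, which then shows up in full when the resource is destroyed.
  input = {
    host        = var.host
    private_key = var.ssh_private_key
  }

  provisioner "remote-exec" {
    when = destroy

    connection {
      type        = "ssh"
      host        = self.input.host
      user        = "root"
      private_key = self.input.private_key
    }

    inline = ["/usr/local/bin/cleanup.sh"]
  }
}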

@mateusmedeiros

I understand there's probably plenty of implementation details and/or design considerations that could make this not worth implementing (like, for example, the aforementioned fact that the provider protocol wasn't designed to handle dynamically sensitive values), but in my opinion this is the most intuitive behavior.

terraform_data is already a special case, and with its input/output interface it makes intuitive sense that the output could be marked as sensitive or not, just like a first-class output. That's purely from a user's perspective, of course.

Since we usually have workarounds or other ways to achieve the same results, it makes a lot of sense to me as a developer that, with the implementation being non-trivial, this should not be implemented and instead perhaps just be covered with a heads-up note in the docs, even though to me it remains the most intuitive behavior.

Just my two cents' worth.

@jbardin
Member

jbardin commented Jul 20, 2023

Note that terraform_data is not a special case right now; it follows the same internal protocol as any other managed resource. Making it a special case would be unprecedented, and would probably only be addressed by a larger change to the protocol. Whether that's something to undertake would need some research, since passing sensitivity information through the protocol does little good if it can't be enforced and ends up ignored by most providers.

@mateusmedeiros

mateusmedeiros commented Jul 23, 2023

Note that terraform_data is not a special case right now; it follows the same internal protocol as any other managed resource. Making it a special case would be unprecedented, and would probably only be addressed by a larger change to the protocol. Whether that's something to undertake would need some research, since passing sensitivity information through the protocol does little good if it can't be enforced and ends up ignored by most providers.

Sorry, I meant it's a special case in concept as a resource/provider, not in implementation. It usually serves the purpose of tapping into the resource lifecycle without having an actual object underneath, so it's exceptional in that sense: all other resource blocks from other providers manage some actual underlying object.

What I tried to say can be summarized as: "I understand if, because of implementation details or other design considerations, this isn't a good idea, and I would probably agree if that's the case, but for what it's worth, from the end user's perspective I believe the sensitive attribute makes sense in the case of the terraform_data resource."

@OJFord
Contributor

OJFord commented Oct 27, 2023

An easy, safe solution could just be that output is always marked sensitive?

Then, in the case when it really isn't (i.e. there's nothing sensitive in the input), the user can just use nonsensitive(terraform_data.foo.output.bar) as desired.
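
i.e. something along these lines (resource and attribute names are placeholders):

locals {
  # If output were always marked sensitive, opting back out for values
  # that aren't actually secret would just be:
  not_secret = nonsensitive(terraform_data.foo.output.bar)
}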

@kitos9112

together with a remote-exec provisioner that is triggered on destroy. Terraform tells me that provisioners at destroy-time can only access the object itself. Therefore, I store all connection information, including the SSH private key, in the input of the terraform_data resource.

A similar use case here. I'm leveraging a destroy-only provisioner, which cannot reference locals or other values, only self (its input or triggers_replace). A bit unfortunate; I can't think of any workaround at the moment that would make this use case work.

@egorshulga

Let me offer another proposal (just brainstorming): would it be possible to introduce a new property input_sensitive together with an attribute output_sensitive? Of course, the documentation should make it explicit that anyone who wants to use secrets in terraform_data should use the new input_sensitive rather than the existing input.
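
Hypothetically, the proposed shape would look something like this (these attributes don't exist today; the variable is a placeholder):

resource "terraform_data" "example" {
  # Proposed new attribute: values assigned here would always be treated
  # as sensitive, and surfaced through a matching output_sensitive attribute.
  input_sensitive = var.api_token
}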

@robson90

Just from a developer perspective: if my input is sensitive, the output has to be too. For me this is inconsistent:

[screenshot: plan showing input as (sensitive value) while output is displayed in full]

@cveld

cveld commented Apr 29, 2024

Using Terraform 1.8.1 I am getting:

╷
│ Error: Provider produced inconsistent final plan
│
│ When expanding the plan for terraform_data.kubernetes to include new values learned so far during apply, provider   
│ "terraform.io/builtin/terraform" produced an invalid new value for .input: inconsistent values for sensitive        
│ attribute.
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵

Using the following config:

resource "terraform_data" "kubernetes" {
  depends_on = [azurerm_role_assignment.aks_admin]
  input      = azurerm_kubernetes_cluster.default
}

This happens when the terraform_data resource already exists upfront and the azurerm_kubernetes_cluster.default is changed.

It seems related to the sensitive value flag, so I thought I'd bring it up in this issue.

Just upgraded to 1.8.2. Let's see if it still happens.
