
Data source dependent on for_each/count resource doesn't wait for destroy before reading. #32179

Closed

Omicron7 opened this issue Nov 7, 2022 · 5 comments · Fixed by #32209
Omicron7 commented Nov 7, 2022

Terraform Version

Terraform v1.3.4
on darwin_arm64
+ provider registry.terraform.io/hashicorp/aws v4.38.0

Terraform Configuration Files

# Data Source to read SSM parameters by path
data "aws_ssm_parameters_by_path" "parameters" {
  path = "/testing/path"
  depends_on = [
    aws_ssm_parameter.parameters,
  ]
}

variable "remove" {
  type    = bool
  default = false
}

locals {
  params = merge(
    {
      NAME = "value"
      FOO  = "bar"
    },
    var.remove ? {} : {
      REMOVE = "this"
    }
  )
}

# Resource to create/manage SSM parameters, Data source depends on this.
resource "aws_ssm_parameter" "parameters" {
  for_each  = local.params
  name      = "/testing/path/${each.key}"
  type      = "String"
  value     = each.value
  overwrite = true
}

output "parameters" {
  description = "Parameters"
  value       = join("\n", data.aws_ssm_parameters_by_path.parameters.names)
}

Debug Output

https://gist.github.com/Omicron7/d2c8abfdc4397b22ee8004921d8366bc

Expected Behavior

The Terraform output and data source should reflect changes made during the apply, including resources removed during that apply.

Actual Behavior

When removing a resource from the for_each, the data source that depends on it doesn't wait for the resource to be destroyed before reading, even though the plan states "(depends on a resource or a module with changes pending)". This results in the data source and the output containing the value that was destroyed after the apply completes. A subsequent plan/apply causes the data source to re-read and update to the correct values.

Steps to Reproduce

  1. terraform init

  2. terraform apply
     Initial apply to add the resources and read the data source that depends on them.

  3. terraform apply -var remove=true
     Apply to remove one of the resources from the for_each.

Additional Context

We are running into an issue where a data source that depends on a for_each resource isn't waiting for resources to be destroyed when changes occur.

In our case, Terraform manages some of the parameters in AWS SSM Parameter Store, while others are managed outside of Terraform and read by a data source. The issue is that when a parameter managed by Terraform is removed from the for_each, the data source that reads the parameters doesn't wait for the parameter to be destroyed before reading, even though it depends on the parameters. This results in the output retaining the deleted key, and it is only fixed by a subsequent plan/apply.

I verified that the resource graph is correct, and the data source does indeed depend on the resource for_each. It works correctly when a resource is added to the for_each, meaning the dependent data source waits for the create to succeed before reading, but not when destroying. It does work when parallelism is set to 1, but this isn't a good workaround.

I don't believe this is specific to the AWS SSM terraform resource, but is an issue with any data source dependent on a resource using a for_each/count. I just wasn't able to find another simple resource/data source to use as an example.

References

No response

@Omicron7 Omicron7 added the bug and new (new issue not yet triaged) labels Nov 7, 2022
@apparentlymart
Member

Hi @Omicron7! Thanks for reporting this.

Unfortunately the behavior you've described here is working as designed, because when considering destroy actions Terraform understands "A depends on B" as meaning "B must outlive A" or "actions for A must happen before destroying B"; that is often the most useful interpretation of dependencies in this situation, because otherwise A would outlive B and therefore actions against A will typically fail.

I can see in your case, though, that you are explicitly trying to observe the side-effect of deleting this object using a data source. We don't typically recommend having a single Terraform configuration both manage and read the same object, because it tends to lead to situations like this, where data-flow analysis can be insufficient to guess what order was intended. I would typically recommend designing this differently so that the information about which parameters are populated comes from the same resource that is populating them, and therefore the side-effects are guaranteed to happen in the correct order to observe the result.

With that said, I can see that there is a bit of an "impedance mismatch" here in that the AWS provider offers a data source for querying everything under a prefix but doesn't have a matching resource type for managing everything under a prefix, so in this particular case I don't see a way to achieve what I recommended above.

The hashicorp/consul provider has a similar family of resource types for managing and reading items in Consul's key/value store, but that provider has a slightly different design that makes it possible to implement what I described above: consul_key_prefix managed resource type for managing everything under a prefix with a single resource, and consul_key_prefix data source for reading the corresponding same set of keys. This means that the configuration that is writing the data into Consul doesn't need to use the data source: the same information is available through the managed resource type and so we can avoid this problem of coordinating read and write side-effects across two different resources.
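A rough HCL sketch of that Consul pattern (the `path_prefix` and `subkeys` attribute names are per the hashicorp/consul provider docs; treat the details as illustrative):

```hcl
# Manage everything under a prefix with a single resource.
resource "consul_key_prefix" "app" {
  path_prefix = "app/config/"

  subkeys = {
    NAME = "value"
    FOO  = "bar"
  }
}

# The same data is available from the managed resource itself,
# so no separate data source read (and no ordering problem).
output "app_config" {
  value = consul_key_prefix.app.subkeys
}
```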

So with that said, I wonder if the hashicorp/aws provider team would consider offering a symmetrical aws_ssm_parameters_by_path managed resource type which manages all parameters under a given path prefix and exports the resulting map of data in the same way the data resource would, so that your parameters output could refer to the managed resource directly and thus ensure that read and write operations are always coordinated in a single action.

With the provider as it exists today I believe it would only be possible to get there by symmetrically using only the single-parameter resource types named aws_ssm_parameter (both managed resource and data resource of that type) so that the configuration which is managing this can avoid also reading them via a separate resource.
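Concretely, the original configuration could drop the by-path data source and read the names back through the same resource it manages. This is only a sketch, and it can of course only see the Terraform-managed parameters, not ones written by other systems:

```hcl
resource "aws_ssm_parameter" "parameters" {
  for_each  = local.params
  name      = "/testing/path/${each.key}"
  type      = "String"
  value     = each.value
  overwrite = true
}

# Derive the output from the managed resource instead of a separate
# data source, so a removed for_each key disappears in the same apply.
output "parameters" {
  value = join("\n", [for p in aws_ssm_parameter.parameters : p.name])
}
```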

I think avoiding reading and managing the same object with two different resources in the same configuration is the most practical change here, but of course whether that will be possible will depend on what the provider team is willing to support. It isn't clear yet how we could change Terraform Core to support what you described, since the ordering Terraform Core currently uses is the more useful one in cases where dependencies are being used to represent ordering of actions against different objects, as opposed to ordering different actions against the same object (where Terraform can't tell that they are the same).

Thanks!

@apparentlymart apparentlymart added the enhancement, core, and working as designed (confirmed as reported and closed because the behavior is intended) labels and removed the new (new issue not yet triaged) and enhancement labels Nov 7, 2022
@Omicron7
Author

Omicron7 commented Nov 7, 2022

@apparentlymart Thanks for the input. I agree, using a single/symmetrical resource or having Terraform manage all of the parameters would be the way to go. Unfortunately this was a very simplified example. In our actual infrastructure, multiple systems create parameters in SSM, and we look them up with an external data source using the path and AWS tags.

For now, double plan/apply works, just seemed like it shouldn't be needed.

@jbardin
Member

jbardin commented Nov 7, 2022

I'm fairly certain this is a bug in the ordering. The concept of "outliving" a dependency is the destroy order, but a data source read is an update action, and should depend on the deletion of the dependency (unless create_before_destroy was used). This creates an inconsistency in the ordering when compared with a dependency replacement, because during replacement the data source would wait for the new instance, which in turn waits for the destroy to complete. One of the principles of the dependency graph is that the absolute ordering of each action relative to any other remains the same, regardless of whether the resources are being replaced, updated, or deleted entirely.

The problem of using the data source to represent resources managed elsewhere in the configuration remains the same though -- and because that is not an intended or recommended use case is probably why this ordering issue has yet to be reported.

As for what exactly is going on here, only managed resources record their dependencies for destroy ordering, so there is probably an assumption somewhere in the code which misses the data source dependency when setting the destroy edges.

@jbardin
Member

jbardin commented Nov 8, 2022

I confirmed my suspicion, data sources in this situation are overlooked. Data and managed resources share the same implementation at this level, but because data sources don't store their dependencies in the state (because data source state isn't really used), they end up being skipped when connecting the destroy edges.

@jbardin jbardin removed the working as designed (confirmed as reported and closed because the behavior is intended) label Nov 8, 2022
@jbardin jbardin self-assigned this Nov 11, 2022
@github-actions

github-actions bot commented Jan 1, 2023

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 1, 2023