`depends_on` always triggers data source read #11806

queeno · 2017-02-09T01:50:35Z

Hi there,

Terraform Version

Terraform v0.8.4

Affected Resource(s)

data.template_file

Terraform Configuration Files

data "template_file" "hello" {
    template = "template"
    depends_on = ["null_resource.hello"]
}

resource "null_resource" "hello" {}

Debug Output

https://gist.github.com/queeno/f198c93760f7c60e5102e7fd5873ad1f

Expected Behavior

terraform plan shouldn't re-read the data source.

Actual Behavior

terraform plan re-reads the data source.

❯ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but
will not be persisted to local or remote state storage.

null_resource.hello: Refreshing state... (ID: 3983604783330107355)

The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

<= data.template_file.hello
    rendered: "<computed>"
    template: "template"


Plan: 0 to add, 0 to change, 0 to destroy.

Steps to Reproduce

terraform plan

Important Factoids

When using depends_on in template_file, terraform plan always seems to re-read the data source. If the data source is used by an instance's user-data, terraform plans to change the instance's user-data. terraform apply, however, doesn't produce any change.

If depends_on is not used, then the data source is not re-read.

apparentlymart · 2017-02-09T02:04:26Z

Hi @queeno! Thanks for this issue.

I seem to remember this being done intentionally to fix a bug a while ago. depends_on presents a rather-awkward situation for data sources because they are defined as being refreshed early when they have no computed data, but depends_on means Terraform can't tell what they might be depending on.

There is possibly something more refined we could do here to make this support more cases, but unfortunately I think right now this is working as designed and we don't really have a better strategy in mind.

queeno · 2017-02-09T02:14:39Z

Hi @apparentlymart

Thanks for your very quick response!

That makes sense and wouldn't be too worrying, however as I mentioned earlier, when using the data source in an instance's user-data, terraform plan constantly shows the user-data changing.

terraform apply doesn't do anything as expected.

If you could do anything to fix this behaviour, it would be hugely appreciated 👍 😄

Have a look:


<= module.tw_instance.data.template_file.cloud-config.0
    rendered:                "<computed>"
    template:                "(my template)"
    vars.%:                  "2"
    vars.etcd_discovery_url: "https://discovery.etcd.io/ddd"
    vars.region:             "europe-west-1"

<= module.tw_instance.data.template_file.cloud-config.1
    rendered:                "<computed>"
    template:                "(my template)"
    vars.%:                  "2"
    vars.etcd_discovery_url: "https://discovery.etcd.io/ddd"
    vars.region:             "europe-west-1"

<= module.tw_instance.data.template_file.cloud-config.2
    rendered:                "<computed>"
    template:                "(my template)"
    vars.%:                  "2"
    vars.etcd_discovery_url: "https://discovery.etcd.io/ddd"
    vars.region:             "europe-west-1"

~ module.tw_instance.google_compute_instance.myvm.0
    metadata.%:         "" => "<computed>"
    metadata.sshKeys:   "(my key)" => ""
    metadata.user-data: "(my template)" => ""

~ module.tw_instance.google_compute_instance.myvm.1
    metadata.%:         "" => "<computed>"
    metadata.sshKeys:   "(my key)" => ""
    metadata.user-data: "(my template)" => ""

~ module.tw_instance.google_compute_instance.myvm.2
    metadata.%:         "" => "<computed>"
    metadata.sshKeys:   "(my key)" => ""
    metadata.user-data: "(my template)" => ""

Plan: 0 to add, 3 to change, 0 to destroy.

apparentlymart · 2017-02-09T04:27:51Z

@queeno as a workaround I suggest that you add a triggers map to your null_resource with some fixed value inside and then interpolate it into an unused value in the vars block on the template_file.

That should then make Terraform see the dependency via the interpolation, allowing you to remove the depends_on and bypass this.

That is assuming that wasn't just a contrived example for the bug report... If it was, I'm sure it's possible to adapt this workaround to whatever resources you are really using... just interpolate anything from the resource you want to depend on into a template variable. Most resources have a reasonable id attribute you can interpolate to achieve this.
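
A minimal sketch of that workaround, assuming the contrived null_resource.hello and template_file.hello from the original report. The trigger key `fixed` and the template variable `dependency` are made-up names for illustration, and the variable is deliberately not used in the template:

```hcl
resource "null_resource" "hello" {
  # A constant value, so the trigger itself never causes a change.
  triggers = {
    fixed = "1"
  }
}

data "template_file" "hello" {
  template = "template"

  # No depends_on here. The interpolation below gives Terraform a concrete
  # value to track, which creates the dependency implicitly.
  vars = {
    dependency = "${null_resource.hello.triggers.fixed}"
  }
}
```

Interpolating `${null_resource.hello.id}` instead of the trigger value should serve the same purpose.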

queeno · 2017-02-09T10:57:06Z

Hi @apparentlymart

Great advice! I have followed your suggestion, removed the explicit dependency and had it all working as intended. I hope this issue can also help others in my same situation, while you fix the actual bug in terraform.

Yes, sorry it was a contrived example for the bug report. This is the actual code I am working on:

https://github.com/queeno/infra-problem/blob/master/terraform/modules/tw-instance/main.tf#L24

Thanks again for your help! 👍

mitchellh · 2017-02-10T18:22:30Z

Data sources are always refreshed during refresh. However, if the upstream things it depends on haven't changed or are available, we should not refresh the data source or show it in the plan. Definitely a bug in my view. Thanks.

jbardin · 2017-02-10T22:58:24Z

I agree. The change that @apparentlymart is referring to is #10670 which is only intended to prevent early evaluation when there is an explicit dependency.

I think this can be made to work with depends_on.

apparentlymart · 2017-02-10T23:42:11Z

Perhaps a suitable compromise would be this:

If a data block has a depends_on, ignore it during the refresh walk and then during the plan walk look up the nodes being depended on and generate the data source refresh diff only if at least one of them has any sort of diff in the plan so far.

Perhaps this is trickier than it seems though, if e.g. the dependencies are indirect via nodes that don't themselves generate diffs (modules, for example).

queeno · 2017-02-11T00:01:41Z

I'm just throwing this out there, but why would the behaviour of an implicit dependency be different from that of an explicit dependency?

In the previous example, if I reference the null_resource implicitly, by adding a variable in the vars section of the data resource, this doesn't produce a refresh diff.

When I explicitly set the dependency, by using the depends_on attribute, then I'll see the refresh diff. Please notice here that the null_resource is never triggered and never changes.

apparentlymart · 2017-02-11T01:12:45Z

@queeno the problem is that with an interpolation Terraform can tell the difference between the value not being available yet and it being available in the state from a previous run. With depends_on it cannot, because there is no specific value check -- Terraform knows that something about the dependency affects the outcome of the datasource, but it can't tell what, so it just pessimistically assumes that we must always process the dependency resource first, because that's the safest and most conservative behavior.

My proposed compromise tries to get around that by using the presence of any diff on the dependency as a signal that the data source should be re-run, thus allowing us to mimic the convergent behavior of an interpolation-dependency where it'll trigger the read only when there's a create or update (of any attribute) on the things it depends on.

mitchellh · 2017-02-11T04:35:14Z

@apparentlymart It has to be present in the diff because we need it to be present in the diff for Apply to do anything to downstream nodes that may depend on it. If we don't put the data source in the diff then its outputs won't be computed which won't trigger downstream normal resources to be in the diff and so on...

johnrengelman · 2017-02-23T21:21:29Z

I'm also running into this currently because I was trying to use datasources to provide a module<->module dependency in a test framework. I'm setting up a test for a module which has some prereqs but I want it all deployed as 1 project in the test framework.

When I add a depends_on to a datasource, subsequent plans always show changes to be made.

apparentlymart · 2018-01-04T20:11:10Z

Sorry for the long silence here, everyone!

We've been looking at this issue again recently, and I've written up #17034 as a proposal for one way to address it. This proposal builds on the discussion above, and attempts to also deal with some other similar quirks with implicit dependencies from data resources.

It'll take some more prototyping to see if that proposal is workable, since there are undoubtedly some subtleties that we didn't consider yet. We won't be able to work on this immediately due to other work in progress, but we do intend to address this.

swetli · 2018-12-10T13:41:47Z

I had a similar issue with the local_file data source. I had a null_resource that created a file foo, and a data source that read the contents of that file in Terraform. Whenever I added depends_on to the local_file data source, its content always showed as computed. To work around that, I used this:

resource "null_resource" "create_file" {

}

data "null_data_source" "file" {
  inputs = {
    data = "${file("foo")}"
    null = "${format(null_resource.create_file.id)}"
  }
}

Later we can reference the data by: ${data.null_data_source.file.outputs["data"]}

The default service account data resource currently uses a depends_on flag added to prevent a race condition in terraform-google-modules#141. Due to the way that Terraform refreshes data resources, Terraform thinks that the data resource has changed when in actuality it hasn't: hashicorp/terraform#11806 (comment). By changing to use a null data resource that interpolates the default service account email, the data resource will only change when the project number does.

trebidav · 2019-04-11T14:05:44Z

Terraform v0.11.13

I am having the same issue as mentioned here. Unfortunately looks like there is no workaround for my use case at this moment. Or.. any ideas?

resource "aws_ecs_task_definition" "application" {
  ...
}

data "aws_ecs_task_definition" "application" {
  task_definition = "${aws_ecs_task_definition.application.family}"
  depends_on = ["aws_ecs_task_definition.application"]
}

resource "aws_ecs_service" "application" {
  task_definition = "${aws_ecs_task_definition.application.family}:${max("${aws_ecs_task_definition.application.revision}", "${data.aws_ecs_task_definition.application.revision}")}"
  ...
}

Plan:

 <= module.celery.data.aws_ecs_task_definition.application
      id:              <computed>
      family:          <computed>
      network_mode:    <computed>
      revision:        <computed>
      status:          <computed>
      task_definition: "task_name"
      task_role_arn:   <computed>

  ~ module.celery.aws_ecs_service.application
      task_definition: "task_name:5" => "${aws_ecs_task_definition.application.family}:${max(\"${aws_ecs_task_definition.application.revision}\", \"${data.aws_ecs_task_definition.application.revision}\")}"

Plan: 0 to add, 1 to change, 0 to destroy.

Output:

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

soumitmishra · 2019-04-16T14:52:52Z

+1

apparentlymart · 2019-04-16T15:00:03Z

@trebidav if you're reading that task definition immediately after creating it just to get the revision value from it, I'd suggest looking to see if that same attribute is exported from the aws_ecs_task_definition resource type, and if not to open a feature request for the AWS provider for it to be. There should rarely be any reason to both create something and then read it with a data source in the same module.

With that said, the depends_on is redundant in your configuration in any case. You can safely remove depends_on from the data block without changing behavior, because the reference in the task_definition argument already implies that same dependency.
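
As a sketch of that second point, this is the data block from the configuration above with only the depends_on line removed; nothing else is changed or assumed:

```hcl
data "aws_ecs_task_definition" "application" {
  # The reference to the managed resource's family attribute already
  # orders this read after aws_ecs_task_definition.application, so an
  # explicit depends_on adds nothing.
  task_definition = "${aws_ecs_task_definition.application.family}"
}
```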

To everyone else: leaving "+1" or 👍 comments here doesn't do anything except create noise for those who are watching this issue for updates. If you want to vote for this issue, please leave a 👍 reaction on the original comment on this issue (not this comment!), since then we can query that as an input for prioritization.

sanchetanparmar · 2019-08-21T11:27:49Z

I am having the same issue with the latest version. I tried with a null_resource as well, but after the null resource it gets stuck on the data source: it could not find the resource. My code is here

masterjg · 2019-10-28T11:49:16Z

I do not know for sure if this is related but when I do not set depends_on in aws_network_interfaces datasource terraform doesn't find anything:

data "aws_network_interfaces" "lb" {
  filter {
    name = "description"
    values = [
      "ELB net/${aws_lb.ec2_service.name}/*"
    ]
  }
  filter {
    name = "vpc-id"
    values = [
      var.vpc_id
    ]
  }
  filter {
    name = "status"
    values = [
      "in-use"
    ]
  }
  filter {
    name = "attachment.status"
    values = [
      "attached"
    ]
  }
}

Since I am referring to aws_lb.ec2_service.name it should automatically wait for aws_lb resource but it doesn't for some reason... However if I add depends_on it waits for resource but will trigger updates to dependent resources during each apply...

uhinze · 2020-01-22T09:11:59Z

Just stumbled over this and just wanted to share my (hacky but not overwhelmingly complex) solution.

I basically take the ID of the null_resource and put it into some attribute in the data provider, reducing it to length 0 so it doesn't affect the actual attribute value. Like so:

resource "null_resource" "foo" {}
data "kubernetes_secret" "bar" {
  metadata {
    name      = "baz${replace(null_resource.foo.id, "/.*/", "")}"
  }
}

This should work with any data provider I think.

evgenibi · 2020-03-10T10:39:12Z

Same issue here,

Trying to get the aws_route_tables data with a specific filter and it just reads it before creating the resources resulting in:

The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends

hakro · 2020-04-22T15:26:10Z

I have a similar issue too.
I need to create an AWS Secret, then trigger a lambda invocation using data.aws_lambda_invocation.my_lambda.

But the Lambda needs to be invoked only when the secret has been created, so I added depends_on = [aws_secretsmanager_secret_version.my_secret]
Otherwise the lambda invocation would fail, since it would reference a secret that doesn't exist yet.
But adding the depends_on shows this in the plan:

Even though there is nothing to create, change, or destroy

hfgbarrigas · 2020-05-08T10:47:09Z

Stumbled on something similar to this one unfortunately.

resource "time_rotating" "rsa1" {
  rotation_minutes = 1
}

resource "null_resource" "create-local-file" {
  //the file name will be the rotation timestamp
  triggers = {
    timestamp = time_rotating.rsa1.unix
  }
}

data "template_file" "loca-file-contents" {
  template = file("${path.module}/${time_rotating.rsa1.unix}")
  vars = {
    id = null_resource.create-local-file.id
  }
}

Basically, I want to create a local file with a fixed rotation.
On the first run everything goes smoothly because the file gets created. On subsequent runs the file doesn't get created because the rotation period has not been reached, so the datasource for the file content shouldn't be evaluated, since its dependencies have no changes. Is that correct?

FIRST RUN:

SECOND RUN (with file function and not removing the files):

THIRD RUN (with file function and removing the files):

It appears that although dependencies have no changes terraform still refreshes the datasource that depends on a file that will never exist until there are changes.

apparentlymart · 2020-05-12T21:11:54Z

Hi @hfgbarrigas,

The behavior you saw there is as intended, because the file function is for reading files that exist statically as part of the configuration, not for files that are generated dynamically during a Terraform run. Terraform reads the file proactively during initial configuration decoding so that it can use the result as part of static validation.

Although I'd recommend avoiding generating local files on disk if you can, in unusual situations where you can't avoid it you can read the contents of a file using the local_file data source instead, which (because it's a data source, rather than an intrinsic function) will take its action during the graph walk, not during initial configuration loading.

resource "time_rotating" "rsa1" {
  rotation_minutes = 1
}

resource "null_resource" "create_template_file" {
  triggers = {
    filename = time_rotating.rsa1.unix
  }
  provisioner "local-exec" {
    # I added this because I assume you're intending to run
    # a local command to generate this file. You can refer
    # to self.triggers.filename in the provisioner configuration
    # to get the filename populated above.
  }
}

data "local_file" "generated_template" {
  filename = "${path.module}/${null_resource.create_template_file.triggers.filename}"
}

data "template_file" "result" {
  template = data.local_file.generated_template.content
  vars = {
    id = null_resource.create_template_file.id
  }
}

If you have any follow-up questions about that, please feel free to create an issue in the community forum and I can follow up with you there.

hfgbarrigas · 2020-05-16T10:06:10Z

Hi @apparentlymart , you're right regarding the function, completely missed that documentation ... Although, I had tried the snippet you shared and had the same issue when it comes to refresh state and local files. Terraform seems to need the local file when refreshing state even though dependencies have no changes. Is this intended? Here's a snippet to reproduce.

data "local_file" "generated_template" {
  filename = "${path.module}/${null_resource.file.triggers.timestamp}"
}

data "template_file" "result" {
  template = data.local_file.generated_template.content
  vars = {
    id = null_resource.file.id
  }
}

resource "null_resource" "file" {
  provisioner "local-exec" {
    command = "echo test > $PATH/$NAME"
    environment = {
      PATH = path.module
      NAME = self.triggers.timestamp
    }
  }
  triggers = {
    timestamp = time_rotating.test.unix
  }
}

resource "time_rotating" "test" {
  rotation_minutes = 1
}

output "test" {
  value = data.template_file.result.rendered
}

The first apply is ok, the second apply with the file present is still ok, and the third apply without the file present is not ok. All of these applies were run within a minute to avoid the time rotation.

antoniogomezalvarado · 2020-05-24T10:11:13Z

I know this is closed, but I'm trying my luck here with all the great minds out there. I'm trying to set the DNS record of an EMR master instance once it has finished creating, with the following:

data aws_instance "hive_emr_master" {

  count = "${aws_emr_cluster.hive_cluster.count == 1 ? 1 : 0}"

  filter {
    name   = "tag:aws:elasticmapreduce:instance-group-role"
    values = ["MASTER"]
  }

  filter {
    name   = "tag:Service"
    values = ["Hive"]
  }
  
  filter {
    name   = "tag:Env"
    values = ["${var.tier}"]
  }
}

resource "aws_route53_record" "hive_master_dns_record" {
  zone_id = "${var.resources["rds.hive.metastore.route53.zone.id"]}"
  name    = "SOME_NAME"
  type    = "A"
  ttl     = "300"
  records = ["${data.aws_instance.hive_emr_master.private_ip}"]

  depends_on = [
    "data.aws_instance.hive_emr_master",
    "aws_emr_cluster.hive_cluster"
  ]
}

However the depends_on as you've already encountered keeps changing the DNS record on every plan/apply. Is there a way to trigger the data block once the cluster has finished creating (not using depends_on obviously)? The above will work only for sequential runs, which means I have to trigger apply once again in order for this to work.

Thanks in advance for your time!

ghost · 2020-06-20T01:50:05Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

jbardin added bug core labels Feb 9, 2017

This was referenced Mar 2, 2017

external data source forces new resources on every terraform apply #12326

Closed

Don't display plan diffs with only data reads #12432

Closed

apparentlymart mentioned this issue Sep 8, 2017

plan -detailed-exitcode returns 2 with external provider even if there is no change #16055

Closed

mcanevet mentioned this issue Nov 10, 2017

Datasource rancher_certificate is not idempotent hashicorp/terraform-provider-rancher#54

Closed

This was referenced Nov 16, 2017

Terraform apply fails on subnet dependency #15413

Closed

Output variables populated from data.external data sources are never populated causing errors #16728

Closed

flosell mentioned this issue Dec 29, 2017

[WIP] New Resource: aws_acm_certificate (+ changes to data source to wait for certificate issuing) hashicorp/terraform-provider-aws#2801

Closed

perriea mentioned this issue Jan 25, 2018

Fixing version of terraform (0.10.X) and providers stefanprodan/k8s-scw-baremetal#2

Closed

alexng-canuck mentioned this issue Jan 31, 2018

Compute resources depending on custom image get destroyed+created on every apply oracle/terraform-provider-oci#410

Closed

lilithmooncohen mentioned this issue Jul 27, 2018

delete default compute engine service account stability fix terraform-google-modules/terraform-google-project-factory#3

Merged

bflad mentioned this issue Aug 14, 2018

Terraform thinks user-data is changing when it isn't, resulting in unnecessary resource replacement hashicorp/terraform-provider-aws#5011

Open

shirdoo mentioned this issue Aug 16, 2018

Config var data heroku/terraform-provider-heroku#109

Closed

emilymye mentioned this issue Aug 21, 2018

data.google_dns_managed_zone prompts an "read" update always hashicorp/terraform-provider-google#1902

Closed

apparentlymart mentioned this issue Dec 13, 2018

Aliased Data Source values not expanded during terraform plan. #19635

Closed

kayrus mentioned this issue Jan 6, 2019

Networking: add port data source terraform-provider-openstack/terraform-provider-openstack#567

Merged

heidemn mentioned this issue Mar 6, 2019

data.kubernetes_service should fetch values in "plan" phase rather than "apply" phase hashicorp/terraform-provider-kubernetes#353

Closed

tenpaiyomi mentioned this issue Jun 26, 2019

depends_on utilizing "${format}" vs direct reference behaves incorrectly #21890

Closed

hashibot mentioned this issue Aug 27, 2019

depends_on creates perma-diff for data resources #18600

Closed

hashibot added config v0.10 Issues (primarily bugs) reported against v0.10 releases v0.11 Issues (primarily bugs) reported against v0.11 releases v0.12 Issues (primarily bugs) reported against v0.12 releases v0.9 Issues (primarily bugs) reported against v0.9 releases labels Aug 29, 2019

dsantanu mentioned this issue Mar 9, 2020

aws_network_interfaces datasource cannot create dependency without the use of depends_on #24314

Closed

hfgbarrigas mentioned this issue May 8, 2020

Error refreshing datasource without dependencies change #24899

Closed

jbardin mentioned this issue May 13, 2020

Evaluate data sources in plan when necessary #24904

Merged

maartenvanderhoef mentioned this issue May 13, 2020

Fix issue #72 task_definition known after apply terraform-aws-modules/terraform-aws-atlantis#124

Closed

jbardin closed this as completed in #24904 May 20, 2020

danieldreier added this to the v0.13.0 milestone May 21, 2020

hashicorp locked and limited conversation to collaborators Jun 20, 2020
