Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implicit locals or opt-in to new planning algo? #28573

Open
nl-brett-stime opened this issue Apr 30, 2021 · 5 comments
Open

Implicit locals or opt-in to new planning algo? #28573

nl-brett-stime opened this issue Apr 30, 2021 · 5 comments
Labels
enhancement new new issue not yet triaged

Comments

@nl-brett-stime
Copy link

nl-brett-stime commented Apr 30, 2021

Current Terraform Version

Terraform v0.14.2
+ provider registry.terraform.io/hashicorp/aws v3.37.0
+ provider registry.terraform.io/hashicorp/local v2.1.0

Use-cases

Allow inline resource references in data source blocks like aws_iam_policy_document (as back in version 0.12) without presenting spurious 'changes' in the plan output.

Proposal

Background

(@apparentlymart and @jbardin can probably skip down to proposal #1 and #2 ?)

Right now (in recent versions of TF) if we have a data source block that references another resource, we need to first declare the reference as a local and then reference the local. If we reference the resource attribute directly, Terraform detects 'changes' that aren't due to a mismatch between the configuration and the live/target environment. The differences seem to be due more to the differences between unspecified values and their effective defaults.

E.g., given an attribute like effect in a statement of an aws_iam_policy_document, the default value is Allow. If the actual, created/existing/managed policy already has effect: Allow, Terraform will detect a change if the resource declaration leaves effect unspecified (which should default to Allow and match the managed resource).

Proposal #1: implicit locals

Assuming the local is used to cache/memo-ize the result of having created the dependency resource in the TF state, would it be possible for Terraform to infer an implicit local even though one hasn't been explicitly written into the config?

E.g., given a something like jbardin's example, but without the explicit workaround:

resource "aws_s3_bucket" "b" {
  bucket_prefix = "terraform-0-14-test-"
  acl           = "private"
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "aws:kms"
      }
    }
  }
}

data "aws_iam_policy_document" "b" {
  statement {
    effect    = "Deny"
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.b.arn}/*"]
  }
}

resource "aws_iam_policy" "policy" {
  name        = "terraform_014_test"
  description = "My test policy"
  policy      = data.aws_iam_policy_document.b.json
}

...perhaps Terraform could automatically turn that into something like the following (e.g., in it's AST as some sort of pre-compile/pre-processor step)?

resource "aws_s3_bucket" "b" {
  bucket_prefix = "terraform-0-14-test-"
  acl           = "private"
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "aws:kms"
      }
    }
  }
}

implicit_locals {
  aws_s3_bucket__b__arn = aws_s3_bucket.b.arn
}

data "aws_iam_policy_document" "b" {
  statement {
    effect    = "Deny"
    actions   = ["s3:GetObject"]
    resources = ["${implicit_local.aws_s3_bucket__b__arn}/*"]
  }
}

resource "aws_iam_policy" "policy" {
  name        = "terraform_014_test"
  description = "My test policy"
  policy      = data.aws_iam_policy_document.b.json
}

If there are conflicts with the names of existing locals, perhaps it could just increment and append a serial number until the conflict is resolved e.g., aws_s3_bucket__b__arn__1, aws_s3_bucket__b__arn__2, aws_s3_bucket__b__arn__3 ?

Proposal #2: Opt-in to new resolution mode

Some of the other comments in the references section below mention not being able to implement a cleaner solution due to existing configurations expecting some behavior or other.

Would it be possible to do the cleaner implementation if configuration authors could opt in to the new behavior? E.g., within a module we could add:

terraform {
  should_enable_planning_v2 = true
}

References

@nl-brett-stime nl-brett-stime added enhancement new new issue not yet triaged labels Apr 30, 2021
@nl-brett-stime nl-brett-stime changed the title Implicit locals or opt-in? Implicit locals or opt-in to new planning algo? Apr 30, 2021
@apparentlymart
Copy link
Member

Hi @nl-brett-stime,

If I'm understanding correctly the situation you are describing, I think what you are saying is that you want Terraform to read from aws_iam_policy_document.b during the planning step even though it has a dependency on aws_s3_bucket.b, which has an action that won't occur until the apply step.

This behavior is in order to properly respect the dependencies you've declared, whereas earlier versions of Terraform failed to respect them in any situation where the configuration happened to be fully known during planning.

As you saw in @jbardin's comment over in #27171, the use of an indirection through a local value is already intended as a way to obtain the old behavior even though it was technically incorrect. Indeed, if we were designing this from scratch today we wouldn't even have included that workaround, because it doesn't really make sense that indirectly depending on a managed resource should behave differently than directly depending on it, but that seemed a viable compromise to take in order to keep both behaviors available while making the correct behavior the new default.

I would not expect us to ever make the old behavior the default, because it's a behavior that fails to honor the dependency graph and is therefore incorrect. Given that, I think what remains for this issue here is to discuss whether there's a more intuitive way to specify an opt-in to the old behavior, by which I mean declaring explicitly that you want to use a possibly-stale value from a managed resource during planning, even if a change to the managed resource is planned for the apply step.

@nl-brett-stime
Copy link
Author

@apparentlymart Thanks for your reply!

I could see how something like the following might take a great deal of doing if it's not already present in the framework. However, it seems like it should be possible...

In the given scenario, as is pretty common for policy documents at least, we need the ARN of the depended-upon resource. The action won't be taken until apply-time but it seems like we could know, at plan-time, which actions would lead to a change of ARNs and which will be update-in-place or no-ops. If we expect the ARN to change, I think it's good to alert/notify that the dependent policy will change as well, even if the plan can only show a placeholder indicating that the eventual value will be unknown until that part of the dependency graph is resolved at apply-time. If the planned action over the depended-upon resource won't cause it's ARN to change, then it's safe to compare the refreshed/actual depender resource value with the 'stale' value from the depended-upon resource.

Let me know if I can try to clarify that more. Otherwise, I'm not overly familiar with the code-base so it's entirely possible I'm the one who needs further clarification. I don't have extensive experience with go-lang but I do have a working copy of the repo and have already stepped through some parts in a debugger. If you point out some files that are relevant to this discussion, I should be able to at least review them and get a greater appreciation of the challenges.

@jbardin
Copy link
Member

jbardin commented May 3, 2021

Hi @nl-brett-stime,

Thanks for the feedback. From what I can tell here, the idea of "implicit locals" appears to be another way to declare what we have already, with the same drawback of it requiring the users to know to add this new argument to the configuration.

The present behavior you see aligns with the original intent in terraform which has been present in the codebase for a very long time. The problem is that the original implementation was not correct, and in order to fix the multitude of bugs it caused and produce a consistent behavior, we had to make changes that are seen as breaking existing config which grew around the inconsistent behavior. So what you are referring to as the "new resolution mode" is actually the original mode with the bugs fixed.

I personally would prefer not to have the current behavior, because it adds a special case for this combination of data+managed resource pair, and each exception added to the configuration language adds to the cognitive overhead of using the tool. In fact, when the planning code for data sources was rewritten, these "implicit references" (as they were called in #25961) were not created and the planning worked as you expected.

What we found in testing however is that a lot of existing configuration, provider tests, and examples relied on key aspects of the original behavior, and were broken by this proposed new implementation. We could continue and break all these configurations and declare that they had to add depends_on within each data source, but this was deem unacceptable because no breaking configuration changes were planned, and transitioning from a configuration where depends_on doesn't work for data sources to one where it was mandatory added more complication for users. The other choice was to re-implement the old behavior to allow existing configuration to continue working, with the drawback of having this exception to the dependency handling even though this exception appears to be what users desired in most cases, which brings us here.

I would still like to find a non-intrusive way to alert users about why these changes are triggered, so that if the changes are not expected a solution is more easily discoverable. Warning about the references in general is not acceptable, since the behavior is often desired so there is nothing technically incorrect to warn about. I think this would best be done within the plan itself, by adding a reason for the data source being planned rather than being read. Not only would this help indicate what is triggering the change, but it also help users with more complex configurations where a data being deferred causes unexpected changes much further downstream in the dependency graph.

@nl-brett-stime
Copy link
Author

@jbardin

From what I can tell here, the idea of "implicit locals" appears to be another way to declare what we have already, with the same drawback of it requiring the users to know to add this new argument to the configuration.

Just to clarify: My description above has two different proposals for mitigating the issue.

The first proposal is basically to have Terraform automatically add any locals that are only needed to work around the issue. That way, the old configuration syntax would continue working the same way (but hopefully satisfy the new logic).

The second proposal (which would probably be done instead of the first if it's unworkable) is something like, "Let configuration authors choose when to make the switch from the old syntax to the new syntax" E.g., once they've added the appropriate depends_on attributes to the data sources in their modules, they could add one more bit to say, "this module is now forwards compatible instead of backwards compatible". Maybe the presence of a depends_on attribute itself could be the signal that triggers the new logic.

@jbardin
Copy link
Member

jbardin commented May 4, 2021

Hi @nl-brett-stime,

The behavior we have now has been solidified over two major releases, with module authors already having made any changes that may have been required to accommodate updated versions of terraform. What this effectively means is that changing the default behavior at this stage in development would require a very high bar to put in place. If we were hypothetically going to make a change in any way to the default behavior, we could just go back to he original idea of using depends_on to ensure the dependency is with the entire resource.

There is also no major algorithmic changes to switch on and off in the planning logic (the "old" behavior no longer exists, as there is no longer a separate refresh phase in terraform), we simply trigger the depends_on behavior in the case presented.

Since changing the default behavior is unlikely, I think the best course of action is to make the default behavior easier to understand, which I think we can do with more details in the plan output. This would have the added benefit of making similar and more complex cases also easier to understand for users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new new issue not yet triaged
Projects
None yet
Development

No branches or pull requests

3 participants