terraform: new plan graph #9973

Merged: 26 commits merged into master from f-new-plan on Nov 9, 2016

Conversation

mitchellh
Contributor

Similar to the new apply/destroy graphs.

Experiment flag: -Xnew-apply (merging into that)

High level PR notes:

  • No tests changed, all old tests pass.
  • Shadow comparisons against the original graph pass in tests.
  • The shadow graph is enabled for terraform plan.
  • No existing transforms were changed, so this change poses very low risk to the old-graph codepath of plan.

I originally wanted to wait until the new apply/destroy graphs were more stable, but looking at the core issues as well as the relative stability of the new apply/destroy graphs, I believe it's more valuable to get this in for the 0.8 beta cycle than to wait. There are many complex bugs that have to be fixed in the plan graph, more so than in the apply graph.

An example is #9853: that issue never even cropped up with this new graph due to its simplicity (plans don't even need to care about CBD). The fix is still necessary since the new apply graph uses CBD, but at least the failure would've been smaller in scope.

There are other issues that I've been wanting to work on that are simply too complicated with the current all-in-one graph.

Another huge benefit of this graph is that there is no more graph flattening. None of the complex graphs (plan, apply, destroy) do module flattening anymore. The graphs contain the modules from the get-go, which greatly simplifies cycle handling.

Just as with the other graphs, we enable shadowing on plan, which compares the diffs. I want to get this into the 0.7.10 release so we can start gathering shadow graph errors from plans as we get going.
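
[Editor's note] For readers unfamiliar with the shadowing mechanism mentioned above, here is a minimal sketch of the idea in Go. All names and types here are invented for illustration; this is not HashiCorp's actual shadow-graph code. The point is that the real graph's result is always what the user gets, while the shadow result is only compared and logged.

package main

import (
	"fmt"
	"log"
	"reflect"
)

// Diff is a hypothetical stand-in for a real plan diff.
type Diff map[string]string

// planWithShadow runs the real (old) graph and the shadow (new) graph,
// logs a mismatch if their diffs differ, and always returns the real
// result so the user's operation is unaffected.
func planWithShadow(realPlan, shadowPlan func() Diff) Diff {
	realDiff := realPlan()
	shadowDiff := shadowPlan()
	if !reflect.DeepEqual(realDiff, shadowDiff) {
		log.Println("Real and shadow states do not match!")
	}
	return realDiff
}

func main() {
	oldGraph := func() Diff { return Diff{"aws_instance.foo": "create"} }
	newGraph := func() Diff { return Diff{"aws_instance.foo": "create"} }
	fmt.Println(planWithShadow(oldGraph, newGraph))
}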

terraform: more specific resource references

terraform: outputs need to know about the new reference format

terraform: resources w/o a config still have a referencable name
Member

@jbardin left a comment


Looks great! Just a couple questions for my understanding, but nothing holding up merging this for further trials.

// on the count argument given.
//
// Orphans are found by comparing the count to what is found in the state.
// This tranform assumes that if an element in the state is within the count
Member


seriously, is the "tranform" typo the only fix you're going to throw me? ;)

split[i] = s + ".destroy"
}

result = append(result, strings.Join(split, "/"))
Member


Can you explain what's going on here? I'm not sure I understand what this reference format is doing.

Contributor Author


Yes, definitely, I'm glad you caught this.

So, this is a lamentable example of where we're using strings when we should probably be using some sort of structured data. I actually want to look into that refactor shortly after this, but didn't want to merge the two since I think that'll end up being fairly significant.

In this case, the syntax A/B means "I depend on A, but if A is not available, I depend on B. If A is available, don't connect to B."

The use case for this is in expanded resources. There are three cases for a resource reference:

  1. aws_instance.foo.bar - Equivalent to aws_instance.foo.0.bar. See next.
  2. aws_instance.foo.0.bar, aws_instance.foo.5.bar, etc. (exact index) - In this case, you want to depend EXACTLY on a specific instance if it exists. However, if it doesn't exist (computed count), you want to depend on the thing that will expand into that exact instance. So, in this case, we set the dependency to aws_instance.foo.0/aws_instance.foo.N, where "N" is special syntax used to denote "I will expand to more things at runtime."
  3. aws_instance.foo.*.bar (splat) - Straightforward: you depend on all aws_instance.foo instances. This creates a dependency on aws_instance.foo.*, which every matching resource advertises as a name it can be depended on by.

The key thing is (2) above. The existing reference system didn't support this. THE WAY IT WORKED BEFORE:

  1. Everything would depend purely on the aws_instance.foo granularity (not count-specific)
  2. During count expansion (DynamicExpand in transform_resource.go, the old graph), we'd override DependsOn and DependableName to append the index if we have one. Then we'd reconnect.

This doesn't work in the new graph because some specific indexes (like orphans) are created outside the graph. I also wanted to lay the groundwork so that we can avoid cycles in computed counts if we know a specific index exists (in the state, for example). This isn't fixed yet with this new graph, but we're heading in that direction and we'll have to get there anyway.

In conclusion, the right long-term solution is a structure rather than a string. I patched on the / syntax since a resource name can't contain a /, so it's a safe character to use. However, I do want to take a look at this again to clean it up; I wanted to save that for another time since that refactor will touch a lot more than just plan-time stuff.
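
[Editor's note] Here is a minimal sketch of the "A/B" fallback semantics described above, assuming a simple set of advertised dependable names. The function names (exactIndexRef, resolve) and the literal "N" placeholder as a string are illustrative assumptions, not Terraform's actual implementation.

package main

import (
	"fmt"
	"strings"
)

// exactIndexRef builds a reference for case (2) above: depend on the exact
// instance if it exists, otherwise fall back to the not-yet-expanded resource.
func exactIndexRef(resource string, index int) string {
	exact := fmt.Sprintf("%s.%d", resource, index)
	fallback := resource + ".N" // "I will expand to more things at runtime"
	return exact + "/" + fallback
}

// resolve picks the first candidate in an "A/B" reference that the graph
// actually advertises as a dependable name.
func resolve(ref string, advertised map[string]bool) (string, bool) {
	for _, candidate := range strings.Split(ref, "/") {
		if advertised[candidate] {
			return candidate, true
		}
	}
	return "", false
}

func main() {
	ref := exactIndexRef("aws_instance.foo", 5) // "aws_instance.foo.5/aws_instance.foo.N"

	// With a computed count, only the unexpanded name is advertised, so the
	// reference falls back to it.
	advertised := map[string]bool{"aws_instance.foo.N": true}
	fmt.Println(resolve(ref, advertised)) // aws_instance.foo.N true
}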

//
// Unlike ConfigTransformerOld, this transformer creates a graph with
// all resources including module resources, rather than creating module
// nodes that are then "flattened".
Member


Is there no more graph "flattening", or does this create the equivalent of the "flattened" graph?

Contributor Author


Yes to both, but there is no flattening operation.

A bit of history on flattening: Terraform used to treat modules as sub-graphs (that were never flattened); when you "entered" the subgraph you'd build a new EvalContext, and when you exited you'd throw it away. This was how modules worked in TF 0.3 (the first version with modules, I think).

We soon realized from bug reports that a module is not an isolated graph. There are complex ordering requirements that require entering/exiting the module boundary multiple times. The simplest example is:

  1. module input A created by root.foo resource
  2. module resource module.bar created with input A
  3. module output X created with a value from module.bar
  4. root.baz resource created with module output X
  5. module input B uses value from resource root.baz
  6. module resource module.qux uses input B

Notice how this requires going in and out of the module at least twice. Hence, the isolated subgraph didn't work. This was a serious, serious bug that made modules basically unusable in Terraform 0.3. We needed a quick bolt-on solution to fix it in TF 0.3.1. That solution ended up being: "well, we already have these correct subgraphs, what if we just flattened them in and added the necessary extra edges?" And so we did.

And graph flattening has been with us since.

But it is complicated. We knew even in 0.3.1 that the right solution would've been to put ALL resources in ONE graph from the beginning. Then there is no "flatten" step: you simply have all resources with the proper address from the start.

This transform does that. Flattening is gone.
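
[Editor's note] To make the contrast concrete, here is a minimal sketch of what "all resources in one graph with proper addresses from the beginning" looks like. This is not the real ConfigTransformer; the type and field names are invented for illustration.

package main

import "fmt"

// resourceNode is an illustrative node that carries its full module path
// from the start, so no later "flatten" pass is needed.
type resourceNode struct {
	modulePath []string // e.g. ["root", "child"]
	name       string   // e.g. "aws_instance.foo"
}

// addr builds the fully qualified address, e.g. "module.child.aws_instance.foo".
func (n resourceNode) addr() string {
	addr := ""
	for _, m := range n.modulePath[1:] { // skip the implicit root module
		addr += "module." + m + "."
	}
	return addr + n.name
}

func main() {
	// Root and child module resources alike live in one graph with proper
	// addresses from the beginning.
	nodes := []resourceNode{
		{modulePath: []string{"root"}, name: "aws_instance.foo"},
		{modulePath: []string{"root", "child"}, name: "aws_instance.bar"},
	}
	for _, n := range nodes {
		fmt.Println(n.addr())
	}
}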

@mitchellh merged commit 66ccc19 into master on Nov 9, 2016
@mitchellh deleted the f-new-plan branch on November 9, 2016 at 16:15
@mrwacky42
Contributor

Has anybody mentioned how annoying this is for people who are not Terraform developers?
Almost every time I run terraform, I get pages of output from a shadow state mismatch.

Please, please please please, make "features" like this something I can disable at runtime next time.

I'd love to report bugs, and maybe even fix them, but the output doesn't even tell me what went wrong, other than * apply operation: Real and shadow states do not match!. I don't even know what bug I'd report, or how to sanitize/minimize my configuration to make a reasonable bug report.

I know, shame on me for compulsively upgrading Terraform to get access to bugfixes.

@mitchellh
Contributor Author

Hey there! Yes we know it's annoying, but it's meant to be. It's far better to see this annoying message than to run a future Terraform version that does something horrible to your infrastructure.

There are a few additional notes here:

  1. This is a much larger change to Terraform than we generally make, so the fact that there are more errors than we expected is normal. As for the annoyance factor: future changes shouldn't be as large, so these messages should be rarer.
  2. When we ship an experimental feature, it has already passed the point where all tests pass (unit and acceptance), and the only way we can know whether it's safe to ship is to get real users to use it. However, running experimental features against your real infrastructure is very risky, so we run them in "shadow mode" to lower that risk.
  3. Once the number of shadow errors is sufficiently low, we begin shipping the feature in a beta (at the time of writing, 0.8 beta 1 is live). The beta has the new features ACTIVE, but thanks to those annoying shadow errors we have high confidence it won't destroy your infrastructure unexpectedly :)
  4. As the messages say: the operation that ran DID succeed. So it's a lot of what I would call "noise pollution", but it doesn't actually impact the effect of your Terraform runs.

For folks who update often like yourself, you shouldn't feel discouraged or shamed by it; that's not what we want. We prioritize and fix any shadow errors exactly because we know they're annoying. Every release thereafter has fewer and fewer.

And, to see the fruits of this labor: look at the 0.8 changelog! The features there that I think you'll like (or at least understand) would've been far more difficult without internal changes such as this PR, which require shadowing.

In conclusion: it's annoying, and if you have ideas for improving it I'm open to them, but we want to ship stable software since this can affect real production infra.


@mrwacky42
Contributor

@mitchellh - Thanks for the speedy and lengthy reply. I understand what you're doing and why, but I'm just asking for a way to opt out. Did I say thank you for Terraform? I use it every day, a lot. My life would be significantly less pleasant if I had to use CloudFormation as much as I use Terraform.

My only "idea" is...

If I'm a good tester/citizen/member of the community, we can presume that when I see a bug, I report the bug (and provide a patch) and move on with my life. At this point, I don't need to see the many screenfuls of messaging. I've done the world good!

If I'm a bad tester/citizen/member of the community, we can presume I'm a leech who only takes takes takes, and I won't be reporting the bugs. As such, the shadow graph is useless to me, until you release the new version and I just complain about problems.

So maybe next time, give me a way to disable the shadow graph (or whatever the new feature is), or send its output to STDERR so I can filter it and decide whether to be a good tester or a bad tester. Since the Terraform state embeds the Terraform version, I can't just downgrade to $PRIOR_VERSION without fiddling around with tfstate.

Today, I'm a bad tester, and just need to get stuff done, but I can't see what TF is doing.
Sometimes, I am the good tester who submits PRs.

❤️

@mitchellh
Contributor Author

@mrwacky42 You bring up good points. We've actually had great success with people reporting crashing bugs where the crash log is written to a file. I wonder if we can do the same thing here.

I'm going to think on it and promise to ship a better user experience here. Thanks!

@ghost

ghost commented Apr 20, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost locked and limited the conversation to collaborators on Apr 20, 2020