Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

terraform: Plans can be "complete" and "applyable" #34642

Merged
merged 1 commit into from Feb 9, 2024

Conversation

apparentlymart
Copy link
Member

@apparentlymart apparentlymart commented Feb 9, 2024

This introduces new flags in plans.Plan that indicate whether the modules runtime considers each plan to be "applyable" and "complete".

These ideas are both already implied by some logic elsewhere in the system, but until now we didn't have the decision logic centralized in a single place that could therefore evolve over time without necessarily always updating every caller together.

We'll now have the modules runtime produce its own boolean ruling about each characteristic, which callers can rely on for the mechanical decision-making of whether to offer the user an "approve" prompt, and whether to remind the user after apply that it was an incomplete plan that will probably therefore need at least one more plan/apply round to converge.

The Applyable flag directly replaces the previous method Plan.CanApply, with equivalent logic. Making this a field instead of a method means that we can freeze it as part of a saved plan, rather than recalculating it when we reload the plan, and we can export the field value in our export formats like JSON while ensuring it'll always be consistent with what Terraform is using internally.

Callers can (and should) still use other context in the plan to return more tailored messages for specific situations they already know about that might be useful to users, but with these flags as a baseline callers can now just fall back to a generic presentation when encountering a situation they don't yet understand, rather than making the wrong decision and causing something strange to happen. That is: a lack of awareness of a new rule will now cause just a generic message in the UI, rather than incorrect behavior.

This commit mostly just deals with populating the flags, and then all of the direct consequences of that on our various tests. Further changes to actually make use of these flags elsewhere in the system will follow in later commits, both in this repository and in other repositories.


This is a bit of side-quest on the continuing path towards closing #30937. This particular PR doesn't do anything directly about that issue, but I'm expecting to continue this in later PRs as follows:

  • Once we've explicitly introduced the idea of "deferred actions" into the modules runtime, the presence of at least one of those will be an additional reason to set plan.Complete to false, in addition to the current rule that at least one -target argument causes that.

    -target is essentially an older attempt at the goals we're intending to solve with "deferred actions", so I'm intentionally treating these both together as reasons to consider a plan to be "incomplete", because both have the implication that one should try another plan/apply round after applying the incomplete plan in order to make more progress toward convergence.

  • The stacks runtime currently has a slightly more simplistic idea of "applyable" that just means "plan completed without errors". I'm intending to make that consistent with the modules runtime's definition of that idea, meaning that a stack plan where all of the components are unapplyable would also be unapplyable. That means that a wrapping workflow orchestrator should not prompt the operator for approval of the plan, and should not allow autonomous approval of the plan through the orchestration features.

  • The stacks runtime also currently has no concept of a plan being "incomplete". We'll first want to translate that idea to the overall stack plan: a stack plan is incomplete if any of the component instances in it have incomplete plans.

    Once we have the deferred actions code landed, the stack runtime should also use the "incomplete" flag from one component instance to populate the "force everything deferred" flag when planning any downstream component instances, since otherwise we would not properly respect the order of operations between components.

I split this one out into its own PR just because of how noisy it got with all of the test updates and such, but taken alone it doesn't really achieve much other than just adding a couple new flags to the terraform show -json PLANFILE output, which automations could then potentially use to replace any specific logic they are already using to infer the applyability and completeness from other characteristics of the plan.

These ideas are both already implied by some logic elsewhere in the system,
but until now we didn't have the decision logic centralized in a single
place that could therefore evolve over time without necessarily always
updating every caller together.

We'll now have the modules runtime produce its own boolean ruling about
each characteristic, which callers can rely on for the mechanical
decision-making of whether to offer the user an "approve" prompt, and
whether to remind the user after apply that it was an incomplete plan
that will probably therefore need at least one more plan/apply round to
converge.

The "Applyable" flag directly replaces the previous method Plan.CanApply,
with equivalent logic. Making this a field instead of a method means that
we can freeze it as part of a saved plan, rather than recalculating it
when we reload the plan, and we can export the field value in our export
formats like JSON while ensuring it'll always be consistent with what
Terraform is using internally.

Callers can (and should) still use other context in the plan to return
more tailored messages for specific situations they already know about
that might be useful to users, but with these flags as a baseline callers
can now just fall back to a generic presentation when encountering a
situation they don't yet understand, rather than making the wrong decision
and causing something strange to happen. That is: a lack of awareness of
a new rule will now cause just a generic message in the UI, rather than
incorrect behavior.

This commit mostly just deals with populating the flags, and then all of
the direct consequences of that on our various tests. Further changes to
actually make use of these flags elsewhere in the system will follow in
later commits, both in this repository and in other repositories.
// so maybe we should check plan.Changes.IsEmpty here and use a more
// generic fallback message if the plan doesn't seem to be empty.
// There are currently no other cases though, so we'll keep it
// simple for now.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my edification: aside from errors (handled above), what are some other potential reasons that a non-empty plan could be non-applyable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, there are none, which is why I didn't go to the trouble of complicating this. But since the idea of this was to give us the flexibility to adjust the definition of "applyable" in future -- for reasons we have not yet imagined -- this comment is intended as an aid to the hypothetical future person who is trying to do that and noticing that Terraform CLI is returning a confusing error message in whatever new situation they've introduced.


I'm making a tradeoff here that we've made before but I wasn't very explicit about: we typically assume it's okay for stuff in this codebase to make possibly-short-sighted assumptions about Terraform Core's abstractions because we can always change both e.g. Terraform CLI and Terraform Core together in a single commit if we need to. Whereas subsystems outside of this codebase, like Terraform Cloud, ought to follow the abstraction more closely since we typically not guarantee that they'll upgrade in lockstep with Terraform CLI/Core.

So when Terraform Cloud is relying on this, I would expect it to:

  • Use this flag alone to decide whether or not to show the "approve plan" button, or the automated equivalents thereof.
  • If the plan is also marked as errored then show the errors as an explanation for why the plan is not applyable.
  • If the plan is not errored, and all of the changes are no-op changes, then show "no changes" as the explanation for why the plan is not applyable.
  • If neither of the above conditions applies, show a generic message saying the plan is not applyable, which then serve as a stopgap for any situation where Terraform Core upgrades sooner than Terraform Cloud does, such as when a customer is running an older release of Terraform Enterprise.

The above logic is what the comment you were replying to is gesturing at. If I were writing Terraform CLI fresh today I'd have made it use a similar decision tree as the above, but because it lives in the same codebase as Terraform Core we can more reasonably assume that we'll update Terraform CLI at the same time as adding any hypothetical third case for a plan not being applyable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to merge this now just because it seems like if we want to do anything in response to this comment I could do so in a separate PR, and I have some other commits building on this one that I'd like to try to finish up today. Please let me know if there's something you'd like to change about this, and I'd be happy to do it in a followup PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, not at all! Thanks for the clear explanation.

@apparentlymart apparentlymart merged commit 884e1fb into main Feb 9, 2024
7 checks passed
Copy link

github-actions bot commented Feb 9, 2024

Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch.

Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 11, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants