Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[engine] Only record a resource's chosen alias. #9288

Merged
merged 2 commits into from Mar 28, 2022

Conversation

pgavlin
Copy link
Member

@pgavlin pgavlin commented Mar 24, 2022

As we discovered when removing aliases from the state entirely, the
snapshotter needs to be alias-aware so that it can fix up references to
resources that were aliased. After a resource operation finishes, the
snapshotter needs to write out a new copy of the snapshot. However, at
the time we write the snapshot, there may be resources that have not yet
been registered that refer to the just-registered resources by a
different URN due to aliasing. Those references need to be fixed up
prior to writing the snapshot in order to preserve the snapshot's
integrity (in particular, the property that all URNs refer to resources
that exist in the snapshot).

For example, consider the following simple dependency graph: A <-- B.
When that graph is serialized, B will contain a reference to A in its
dependency list. Let the next run of the program produces the graph A'
<-- B where A' is aliased to A. After A' is registered, the snapshotter
needs to write a snapshot that contains its state, but B must also be
updated so it references A' instead of A, which will no longer be in the
snapshot.

These changes take advantage of the fact that although a resource can
provide multiple aliases, it can only ever resolve those aliases to a
single resource in the existing state. Therefore, at the time the
statefile is fixed up, each resource in the statefile could only have
been aliased to a single old resource, and it is sufficient to store
only the URN of the chosen resource rather than all possible aliases. In
addition to preserving the ability to fix up references to aliased
resources, retaining the chosen alias allows the history of a logical
resource to be followed across aliases.

Fixes #9089.

@pgavlin pgavlin requested review from Frassle and a team March 24, 2022 21:49
pkg/engine/journal.go Outdated Show resolved Hide resolved
@@ -1476,26 +1476,6 @@ func TestAliases(t *testing.T) {
},
}}, []deploy.StepOp{deploy.OpSame, deploy.OpReplace, deploy.OpCreateReplacement, deploy.OpDeleteReplaced})

// ensure failure when different resources use duplicate aliases
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should still test ok now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an error we can no longer detect. The setup for this test is... odd. It starts from an empty base state and wants to validate that two resources cannot mention the same alias. The only way to do that from the statefile alone is to always store all of the aliases in the statefile. In order to support this check, we'd need to move the handling of that specific case into the step generator. I did this in a different branch, but I got a little bit more ambitious there and fixed a few other bugs I found along the way: https://github.com/pulumi/pulumi/compare/pgavlin/choose-single-alias

IMO, this specific check is relatively low-value, as it's really only covering the case where resources that have no base state cannot be mention the same alias. If the resources are both aliased to the same base state, then we will still issue an error (just added a test for that).

There is one interesting case that we don't catch in either case: given the base state [A], it is possible to register a resource A and a resource B that is aliased to A. This is bad, and we should fix it, but that requires fixing a nasty bug in provider aliasing. That bug is an easy fix, I think (a possible fix is included in that other branch), but I'd like to separate it out nonetheless.

@pgavlin pgavlin requested a review from a team March 25, 2022 17:06
@pgavlin
Copy link
Member Author

pgavlin commented Mar 25, 2022

(Not sure why the checks are showing up as mostly failed, but the latest run was successful., nvm--the checks updated with the latest results 👍) The only thing that this needs is a CHANGELOG entry.

@pgavlin
Copy link
Member Author

pgavlin commented Mar 25, 2022

I do want to elaborate on the idea here, too.

As we discovered when removing aliases from the state entirely, the snapshotter needs to be alias-aware so that it can fix up references to resources that were aliased. After a resource operation finishes, the snapshotter needs to write out a new copy of the snapshot. However, at the time we write the snapshot, there may be resources that have not yet been registered that refer to the just-registered resources by a different URN due to aliasing. Those references need to be fixed up prior to writing the snapshot in order to preserve the snapshot's integrity (in particular, the property that all URNs refer to resources that exist in the snapshot).

For example, consider the following simple dependency graph: A <-- B. When that graph is serialized, B will contain a reference to A in its dependency list. Let the next run of the program produces the graph A' <-- B where A' is aliased to A. After A' is registered, the snapshotter needs to write a snapshot that contains its state, but B must also be updated so it references A' instead of A, which will no longer be in the snapshot.

These changes take advantage of the fact that although a resource can provide multiple aliases, it can only ever resolve those aliases to a single resource in the existing state. Therefore, at the time the statefile is fixed up, each resource in the statefile could only have been aliased to a single old resource, and it is sufficient to store only the URN of the chosen resource rather than all possible aliases. In addition to preserving the ability to fix up references to aliased resources, retaining the chosen alias allows the history of a logical resource to be followed across aliases.

As we discovered when removing aliases from the state entirely, the
snapshotter needs to be alias-aware so that it can fix up references to
resources that were aliased. After a resource operation finishes, the
snapshotter needs to write out a new copy of the snapshot. However, at
the time we write the snapshot, there may be resources that have not yet
been registered that refer to the just-registered resources by a
different URN due to aliasing. Those references need to be fixed up
prior to writing the snapshot in order to preserve the snapshot's
integrity (in particular, the property that all URNs refer to resources
that exist in the snapshot).

For example, consider the following simple dependency graph: A <-- B.
When that graph is serialized, B will contain a reference to A in its
dependency list. Let the next run of the program produces the graph A'
<-- B where A' is aliased to A. After A' is registered, the snapshotter
needs to write a snapshot that contains its state, but B must also be
updated so it references A' instead of A, which will no longer be in the
snapshot.

These changes take advantage of the fact that although a resource can
provide multiple aliases, it can only ever resolve those aliases to a
single resource in the existing state. Therefore, at the time the
statefile is fixed up, each resource in the statefile could only have
been aliased to a single old resource, and it is sufficient to store
only the URN of the chosen resource rather than all possible aliases. In
addition to preserving the ability to fix up references to aliased
resources, retaining the chosen alias allows the history of a logical
resource to be followed across aliases.
@pgavlin pgavlin force-pushed the pgavlin/choose-single-alias-2 branch from c463547 to 3442463 Compare March 25, 2022 17:25
Copy link
Member

@justinvp justinvp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pkg/resource/deploy/step_generator.go Show resolved Hide resolved
@pgavlin pgavlin merged commit bac1149 into master Mar 28, 2022
@pulumi-bot pulumi-bot deleted the pgavlin/choose-single-alias-2 branch March 28, 2022 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stack size issues with parented resources and aliases
4 participants