Fix panic when changing untargetted provider versions #15716

Merged
merged 1 commit into from Mar 20, 2024

Conversation

Member

@Frassle Frassle commented Mar 17, 2024

Description

Fixes #15704.

When doing a targeted run, the source evaluator isn't aware of targets, but it is responsible for registering default providers. As such, on getting a resource event with a new provider version (e.g. 1.0 -> 2.0), it will send off a registration for the new version it has seen, which, as a default provider, the step generator will accept and add to state (this is probably OK).
However, when the step generator runs for the resource using this provider, it will see that the resource is not targeted and ignore its new goal state, just reusing its old state. This old state will refer to the old version of the provider (e.g. "default_aws_1_0_0" rather than "default_aws_2_0_0"), which caused a panic in the step generator when trying to build the overall stack state for StackAnalyze, as the old provider had never been registered.

We now catch this situation when generating the same step for a non-targeted resource and error out that this isn't supported.
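
For readers unfamiliar with the engine, the check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `resourceState`, `stepGenerator`, `sameStep`, `errProviderChanged`, and the example URNs are all hypothetical names, and the error message is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// resourceState is a hypothetical, simplified stand-in for a resource's state.
type resourceState struct {
	URN      string
	Provider string // provider reference recorded in state
}

// stepGenerator is a hypothetical, simplified stand-in for the engine's step generator.
type stepGenerator struct {
	targets   map[string]bool // URNs passed via --target
	providers map[string]bool // provider references registered during this run
}

// errProviderChanged is an illustrative error; the real message may differ.
var errProviderChanged = errors.New("provider for non-targeted resource changed")

// sameStep reuses the old state of a non-targeted resource, but first checks
// that the provider the old state refers to was registered in this run.
func (sg *stepGenerator) sameStep(old resourceState) (resourceState, error) {
	if sg.targets[old.URN] {
		return resourceState{}, errors.New("sameStep is only for non-targeted resources")
	}
	if !sg.providers[old.Provider] {
		// Previously this situation surfaced later as a panic; here it is an error.
		return resourceState{}, fmt.Errorf("%w: %s still refers to %s",
			errProviderChanged, old.URN, old.Provider)
	}
	return old, nil
}

func main() {
	sg := &stepGenerator{
		targets:   map[string]bool{"urn:example:other": true},
		providers: map[string]bool{"default_aws_2_0_0": true},
	}
	old := resourceState{URN: "urn:example:bucket", Provider: "default_aws_1_0_0"}
	if _, err := sg.sameStep(old); err != nil {
		fmt.Println("error:", err)
	}
}
```

The point of the sketch is only that the missing provider is detected at the moment the old state is reused, so the run fails with a message instead of panicking while the stack state is assembled.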

Checklist

  • I have run make tidy to update any new dependencies
  • I have run make lint to verify my code passes the lint check
    • I have formatted my code using gofumpt
  • I have added tests that prove my fix is effective or that my feature works
  • I have run make changelog and committed the changelog/pending/<file> documenting my change
  • Yes, there are changes in this PR that warrant bumping the Pulumi Cloud API version

@pulumi-bot
Contributor

pulumi-bot commented Mar 17, 2024

Changelog

[uncommitted] (2024-03-19)

Bug Fixes

  • [engine] Fix a panic when updating provider version in a run using --target
    #15716

@Frassle Frassle force-pushed the fraser/fixProviderVersionPanic branch 2 times, most recently from 2e26b32 to 1928c3a Compare March 18, 2024 14:32
@Frassle Frassle marked this pull request as ready for review March 18, 2024 14:58
@Frassle Frassle requested a review from a team as a code owner March 18, 2024 14:58
tgummerer
tgummerer previously approved these changes Mar 18, 2024
Collaborator

@tgummerer tgummerer left a comment

Thanks! So I guess it was #15476 that caused this, which was first in 3.109.0.

@Frassle
Member Author

Frassle commented Mar 18, 2024

So I guess it was #15476 that caused this, which was first in 3.109.0.

Yeah, maybe, but I suspect things were wrong before that anyway, because the state would have incorrectly said the resource was using the new provider version.

@Frassle Frassle force-pushed the fraser/fixProviderVersionPanic branch 3 times, most recently from b8da548 to 72614f3 Compare March 19, 2024 17:44
@Frassle Frassle force-pushed the fraser/fixProviderVersionPanic branch from 72614f3 to 63f2d89 Compare March 19, 2024 18:25
@Frassle
Member Author

Frassle commented Mar 19, 2024

So I've spent a couple of days on this and it's a gnarly one. Best I've got so far that's consistent is to just turn this into an error case rather than a panic.

It feels like the engine should be able to deal with this, but the current structure of the source evaluator and step generator makes that very hard, maybe impossible.

The source evaluator has no knowledge of --targets (or URNs for that matter), so when it gets the register resource request with the new provider version it doesn't know any better than to just register it onwards. The first part of that is registering the default provider for the new provider version, and setting that as the "provider" field on the goal state that's sent to the step generator.

The step generator then does take --targets and old state into account, but it has no connection back to the source evaluator and its default provider registrations and, more importantly, it isn't re-entrant. So even if we did pass a pointer to the default provider manager to the step generator, it couldn't call into it to start the default provider registration of the old provider version.

All in all, feels like we ought to be able to do something here to support this but I can't see what.
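
To make the source evaluator side of this concrete, here is a rough sketch of the behaviour described in this comment. It is an illustration under assumptions, not engine code: `goal`, `sourceEvaluator`, `registerResource`, and `defaultProviderName` are hypothetical, and the naming scheme simply mirrors the "default_aws_1_0_0" example from the description rather than the engine's real one.

```go
package main

import (
	"fmt"
	"strings"
)

// goal is a hypothetical, simplified stand-in for the goal state sent to the step generator.
type goal struct {
	URN      string
	Package  string
	Version  string
	Provider string
}

// sourceEvaluator is a hypothetical stand-in. Note that it has no field for
// --target: it cannot tell targeted resources from non-targeted ones.
type sourceEvaluator struct {
	defaultProviders map[string]bool
}

// defaultProviderName follows the illustrative naming from the PR description;
// the engine's actual naming may differ.
func defaultProviderName(pkg, version string) string {
	return "default_" + pkg + "_" + strings.ReplaceAll(version, ".", "_")
}

// registerResource registers the default provider for whatever version the
// program asked for and records it on the goal state.
func (se *sourceEvaluator) registerResource(g goal) goal {
	name := defaultProviderName(g.Package, g.Version)
	se.defaultProviders[name] = true
	g.Provider = name
	return g
}

func main() {
	se := &sourceEvaluator{defaultProviders: map[string]bool{}}
	g := se.registerResource(goal{URN: "urn:example:bucket", Package: "aws", Version: "2.0.0"})

	// Only the new version is registered; the version that old state refers to is not.
	oldProvider := defaultProviderName("aws", "1.0.0")
	fmt.Println(g.Provider, se.defaultProviders[oldProvider]) // prints "default_aws_2_0_0 false"
}
```

Because only the version named in the program is ever registered, a non-targeted resource whose old state is reused ends up pointing at a provider that this run never saw.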

Collaborator

@tgummerer tgummerer left a comment

This seems like a good solution for at least providing a better error to the user for now. Might be worth opening a separate issue as a followup for making the user experience better here if we can in the future?

@Frassle Frassle added this pull request to the merge queue Mar 20, 2024
Merged via the queue into master with commit 08aad59 Mar 20, 2024
48 checks passed
@Frassle Frassle deleted the fraser/fixProviderVersionPanic branch March 20, 2024 10:28
github-merge-queue bot pushed a commit that referenced this pull request Mar 27, 2024
Tentative changelog:

### Features

- [docs] Implement constructor syntax examples for every resource in
typescript, python, csharp and go
  [#15624](#15624)

- [engine] Send output values with property dependency information to
transform functions
  [#15637](#15637)

- [engine] Add a --continue-on-error flag to pulumi destroy
  [#15727](#15727)

- [sdk/go] Make `property.Map` keyed by `string` not `MapKey`
  [#15767](#15767)

- [sdk/python] Improve the error message when depends_on is passed
objects of the wrong type
  [#15737](#15737)


### Bug Fixes

- [auto/{go,nodejs,python}] Make sure to read complete lines before
trying to deserialize them as engine events
  [#15778](#15778)

- [cli/plugin] Fix installing local language plugins on Windows
  [#15715](#15715)

- [engine] Don't delete stack outputs on failed deployments
  [#15754](#15754)

- [engine] Fix a panic when updating provider version in a run using
--target
  [#15716](#15716)

- [engine] Handle that Assets & Archives can be returned from providers
without content.
  [#15736](#15736)

- [engine] Fix the engine trying to delete a protected resource caught
in a replace chain
  [#15776](#15776)

- [sdkgen/docs] Add missing newline for `Coming soon!`
  [#15783](#15783)

- [programgen/dotnet] Fix generated code for a list of resources used in
resource option DependsOn
  [#15773](#15773)

- [programgen/{dotnet,go}] Fixes emitted code for object expressions
assigned to properties of type Any
  [#15770](#15770)

- [sdk/go] Fix lookup of plugin and program dependencies when using Go
workspaces
  [#15743](#15743)

- [sdk/nodejs] Export automation.tag.TagMap type
  [#15774](#15774)

- [sdk/python] Wait only for pending outputs in the Python SDK, not all
pending asyncio tasks
  [#15744](#15744)


### Miscellaneous

- [sdk/nodejs] Reorganize function serialization tests
  [#15753](#15753)

- [sdk/nodejs] Move mockpackage tests to closure integration tests
  [#15757](#15757)
@justinvp justinvp mentioned this pull request Mar 27, 2024
github-merge-queue bot pushed a commit that referenced this pull request Mar 28, 2024
Tentative changelog:

### Features

- [docs] Implement constructor syntax examples for every resource in
typescript, python, csharp and go
  [#15624](#15624)

- [docs] Implement YAML constructor syntax examples in the docs
  [#15791](#15791)

- [engine] Send output values with property dependency information to
transform functions
  [#15637](#15637)

- [engine] Add a --continue-on-error flag to pulumi destroy
  [#15727](#15727)

- [sdk/go] Make `property.Map` keyed by `string` not `MapKey`
  [#15767](#15767)

- [sdk/nodejs] Make function serialization work with typescript 4 and 5
  [#15761](#15761)

- [sdk/python] Improve the error message when depends_on is passed
objects of the wrong type
  [#15737](#15737)


### Bug Fixes

- [auto/{go,nodejs,python}] Make sure to read complete lines before
trying to deserialize them as engine events
  [#15778](#15778)

- [cli/plugin] Fix installing local language plugins on Windows
  [#15715](#15715)

- [engine] Don't delete stack outputs on failed deployments
  [#15754](#15754)

- [engine] Fix a panic when updating provider version in a run using
--target
  [#15716](#15716)

- [engine] Handle that Assets & Archives can be returned from providers
without content.
  [#15736](#15736)

- [engine] Fix the engine trying to delete a protected resource caught
in a replace chain
  [#15776](#15776)

- [sdkgen/docs] Add missing newline for `Coming soon!`
  [#15783](#15783)

- [programgen/dotnet] Fix generated code for a list of resources used in
resource option DependsOn
  [#15773](#15773)

- [programgen/{dotnet,go}] Fixes emitted code for object expressions
assigned to properties of type Any
  [#15770](#15770)

- [sdk/go] Fix lookup of plugin and program dependencies when using Go
workspaces
  [#15743](#15743)

- [sdk/nodejs] Export automation.tag.TagMap type
  [#15774](#15774)

- [sdk/python] Wait only for pending outputs in the Python SDK, not all
pending asyncio tasks
  [#15744](#15744)


### Miscellaneous

- [sdk/nodejs] Reorganize function serialization tests
  [#15753](#15753)

- [sdk/nodejs] Move mockpackage tests to closure integration tests
  [#15757](#15757)

Successfully merging this pull request may close these issues.

Pulumi versions after v3.108.1 sometimes crashes/panics when using --target
3 participants