Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Step generator panics #2223

Closed
hausdorff opened this issue Nov 19, 2018 · 5 comments
Closed

Step generator panics #2223

hausdorff opened this issue Nov 19, 2018 · 5 comments
Assignees
Labels
p1 A bug severe enough to be the next item assigned to an engineer
Milestone

Comments

@hausdorff
Copy link
Contributor

This happens when I updated the username of a GKE cluster.

$ p up
warning: A new version of Pulumi is available. To upgrade from version '0.16.3-dev-1541007180-gb7489357-dirty' to '0.16.4', visit https://pulumi.io/install for manual instructions and release notes.
Previewing update (istio):

     Type                            Name         Plan       Info
     pulumi:pulumi:Stack             istio-istio
 ~   ├─ gcp:container:Cluster        gke-cluster  update     [diff: ~masterAuth]
 ~   └─ pulumi:providers:kubernetes  gkeK8s       update..   [diff: ~kubeconfig]
panic: fatal: An assertion has failed

goroutine 163 [running]:
github.com/pulumi/pulumi/pkg/util/contract.failfast(0x1abcd20, 0x17)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/util/contract/failfast.go:23 +0xe9
github.com/pulumi/pulumi/pkg/util/contract.Assert(0xc00182d100)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/util/contract/assert.go:26 +0x49
github.com/pulumi/pulumi/pkg/resource/plugin.(*provider).Diff(0xc000b7da40, 0xc0018607e0, 0x6a, 0xc001081c40, 0x15, 0xc00137d6e0, 0xc0019a4d50, 0xc001238401, 0x0, 0x0, ...)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/resource/plugin/provider_plugin.go:263 +0x3f2
github.com/pulumi/pulumi/pkg/resource/deploy.(*stepGenerator).diff(0xc000174310, 0xc0018607e0, 0x6a, 0xc001081c40, 0x15, 0xc00137d530, 0xc00137d6e0, 0xc0019a4d50, 0x1bf2d00, 0xc000b7da40, ...)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/resource/deploy/step_generator.go:591 +0x121
github.com/pulumi/pulumi/pkg/resource/deploy.(*stepGenerator).GenerateSteps(0xc000174310, 0x72c02a8, 0xc0019ba180, 0x0, 0x0, 0x0, 0xc0017a5c00)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/resource/deploy/step_generator.go:278 +0x1dbf
github.com/pulumi/pulumi/pkg/resource/deploy.(*planExecutor).handleSingleEvent(0xc0005e51c0, 0x1bdca40, 0xc0019ba180, 0xc0017a5cd8)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/resource/deploy/plan_executor.go:211 +0x156
github.com/pulumi/pulumi/pkg/resource/deploy.(*planExecutor).Execute.func3(0xc0005e2720, 0xc0005e51c0, 0xc001613070, 0x1be9ba0, 0xc0005f4380, 0x1be9ba0, 0xc0005f4240, 0x10000, 0x1, 0xc0005d2280)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/resource/deploy/plan_executor.go:171 +0x280
github.com/pulumi/pulumi/pkg/resource/deploy.(*planExecutor).Execute(0xc0005e51c0, 0x1be9ba0, 0xc0005f4240, 0x1be60e0, 0xc0005e6120, 0x7fffffff, 0x10000, 0x3030303030303001, 0x0, 0x0)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/resource/deploy/plan_executor.go:187 +0x4c4
github.com/pulumi/pulumi/pkg/resource/deploy.(*Plan).Execute(0xc0005d8180, 0x1be9ba0, 0xc0005f4240, 0x1be60e0, 0xc0005e6120, 0x7fffffff, 0x10000, 0xc0003fea01, 0x1370000002f, 0x13700000000)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/resource/deploy/plan.go:286 +0x87
github.com/pulumi/pulumi/pkg/engine.(*planResult).Walk.func1(0x1be60e0, 0xc0005e6120, 0xc0005e6090, 0x1be9ba0, 0xc0005f4240, 0x1, 0xc001612fd0, 0xc0005e2240)
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/engine/plan.go:214 +0xc9
created by github.com/pulumi/pulumi/pkg/engine.(*planResult).Walk
	/Users/alex/go/src/github.com/pulumi/pulumi/pkg/engine/plan.go:206 +0x114

cc @swgillespie @pgavlin @lukehoban

@pgavlin
Copy link
Member

pgavlin commented Nov 19, 2018

I just hit this earlier as well.

The root cause of this issue is that an input to a first-class provider changed, and the new value was not known at preview time. When a first-class provider has any unknown configuration values, we do not configure its underlying plugin. We do this in order to avoid unexpected behavior, as no provider plugin is currently able to handle unknown configuration values (there are plans to fix this; see #1718).

Prior to #2088, we would never attempt to use an unconfigured provider. Before those changes, any unknown input to a provider would cause the provider to require replacement during preview. This would cascade s.t. all of the resources managed by the provider also required replacement, which does not require a call to Diff. After #2088, we are attempting to call Diff on the unconfigured provider, which causes this assert.

Any fix here is fraught with user experience issues. We cannot call the underlying provider--it is not configured, and we cannot configure it--so whatever we return from Diff is going to be a guess. The most conservative approach would indicate replacement for any resource diffed by an unconfigured provider, but as we discovered with #2012 (which drove #2088), this is not a great user experience in the common case. The only reliable alternative I see is to indicate diffs but no replacement for any resource diffed by an unconfigured provider, even though this risks surprising the user with replacements at deployment time. As a temporary UX improvement, we could indicate something to the effect of "this diff is unknown" in the resource's status message.

The long-term fix for this is to implement the changes in #1718.

@lukehoban
Copy link
Member

Given that this causes a panic, it seems that it's net worse than prior to #2088, and that we will need to get some fix in place to avoid the panic and turn this back into a "preview UX" issue".

@lukehoban lukehoban added the p1 A bug severe enough to be the next item assigned to an engineer label Nov 19, 2018
@lukehoban lukehoban added this to the 0.19 milestone Nov 19, 2018
@pgavlin
Copy link
Member

pgavlin commented Nov 19, 2018

@lukehoban what did you think of this approach?

The only reliable alternative I see is to indicate diffs but no replacement for any resource diffed by an unconfigured provider, even though this risks surprising the user with replacements at deployment time. As a temporary UX improvement, we could indicate something to the effect of "this diff is unknown" in the resource's status message.

@lukehoban
Copy link
Member

That proposal sounds okay to me, especially if we can add a "warn" message to the status to indicate that there is an unknown input and we can't be sure whether replacement will be required.

How likely are we to be able to properly fix this via #1718 in M20?

@pgavlin
Copy link
Member

pgavlin commented Nov 19, 2018

How likely are we to be able to properly fix this via #1718 in M20?

We can prioritize that work if we believe that it is necessary. It's a pretty simple set of changes, so I doubt that it will take very long to implement the core pieces. The trickier work is in the various providers that want to support partial configuration.

We will probably want to improve the "diff is uncertain" UX when addressing that as-well, maybe going so far as to make it a first-class scenario.

pgavlin added a commit that referenced this issue Nov 21, 2018
After #2088, we began calling `Diff` on providers that are not configured
due to unknown configuration values. This hit an assertion intended to
detect exactly this scenario, which was previously unexpected.

These changes adjust `Diff` to indicate that a Diff is unavailable and
return an error message that describes why. The step generator then
interprets the diff as indicating a normal update and issues the error
message to the diagnostic stream.

Fixes #2223.
pgavlin added a commit that referenced this issue Nov 22, 2018
After #2088, we began calling `Diff` on providers that are not configured
due to unknown configuration values. This hit an assertion intended to
detect exactly this scenario, which was previously unexpected.

These changes adjust `Diff` to indicate that a Diff is unavailable and
return an error message that describes why. The step generator then
interprets the diff as indicating a normal update and issues the error
message to the diagnostic stream.

Fixes #2223.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p1 A bug severe enough to be the next item assigned to an engineer
Projects
None yet
Development

No branches or pull requests

3 participants