[engine] Clear pending operations with refresh. #8435

pgavlin · 2021-11-16T20:09:44Z

Just what it says on the tin. This is implemented by moving the check
for pending operations in the last statefile into the deployment
executor and making it conditional on whether or not a refresh is being
performed (either via pulumi refresh or pulumi up -r). Because
pending operations are not carried over from the base statefile, this
has the effect of clearing pending operations if a refresh is performed.

Part of #4265.

pgavlin · 2021-11-16T20:13:03Z

pending operations are not carried over from the base statefile

FWIW, this aspect of the implementation makes me a little uncomfortable. Another approach that I think is worth considering is something like this:

if a refresh is not being performed, retain the current behavior
otherwise:
a. scan the pending operations for creates. remove these operations and issue warnings re: potentially orphaned resources
b. retain all other pending operations and begin refreshing the stack
c. when a resource R is refreshed, clear any pending operation associated with R

That approach feels a little more well-founded, as it only drops information re: the pending operation once we've explicitly taken action to resolve the operation's results. That said, taking this PR does not prevent us from adopting the "stronger" approach in the future.

t0yv0 · 2021-11-17T16:50:43Z

Before diving in implementation options, can I get a bit more context. Sounds like the user is running into interrupted updates and using our IUR procedure I was just looking up:

** Interrupted Update Recovery

We have a documented procedure for interrupted update recovery:

https://www.pulumi.com/docs/troubleshooting/#interrupted-update-recovery

***  pulumi cancel

What does it to precisely?

Note that this operation is very dangerous, and may leave the stack in
an inconsistent state if a resource operation was pending when the
update was canceled.

*** `pulumi stack export | pulumi stack import`

This clears your state’s stack of all pending operations.

Example:

  warning: removing pending operation 'creating' on '...' from snapshot
  import successful.

*** Manual edit of cloud state

For every warning that this (export|import) command prints out, you
should verify with your cloud provider whether or not this operation
was successful. If the operation was successful, and a resource was
created, you should delete that resource using your cloud provider’s
console, CLI, or SDK

*** `pulumi refresh`

I understand this reconciles by copying cloud state to stack state
when they diverge.

Is the idea that after the change the IUR becomes simply pulumi refresh? Will this automate the manual edit of cloud state? Sounds like from the above description the case causing problems is when a resource was created but the pending create operation is recorded, so we it exists in the cloud but pending operation is the only record on Pulumi side about the resource being managed by Pulumi.

t0yv0 · 2021-11-17T17:29:38Z

I'm reading the #4265 and it's helpful. Luke's comments constrain the scope of the fix. I need to do some more work here and take it for a spin to understand the changes.

I like the "Another approach that I think is worth considering is something like this" as it's spelled out what's happening and I feel like I understand it.

I feel like I need to understand and step through the approach in the PR. Open questions (notes to self to verify):

when pending operations are dropped because pulumi refresh or pulumi up -r, are there still warnings for orphan resources
are normal (non-refresh) updates still blocked on pending operations or they continue rolling forward with a warning (as in Luke's comment in 4625)
double check the IUR procedure after this change
should we consider backlogging the more invasive work of fully automated recovery (including automatic removal of orphan resources), or has it been backlogged? It appears possible theoretically

pgavlin · 2021-11-17T21:04:28Z

when pending operations are dropped because pulumi refresh or pulumi up -r, are there still warnings for orphan resources

Not as written. This would probably be easy to do.

are normal (non-refresh) updates still blocked on pending operations or they continue rolling forward with a warning (as in Luke's comment in 4625)

Yes, they are still blocked on pending operations.

double check the IUR procedure after this change

This is the thrust of this PR. With these changes, the IUR procedure becomes "run pulumi refresh, or use pulumi up -r in all cases to avoid the possibility of errors due to pending operations".

Just what it says on the tin. This is implemented by moving the check for pending operations in the last statefile into the deployment executor and making it conditional on whether or not a refresh is being performed (either via `pulumi refresh` or `pulumi up -r`). Because pending operations are not carried over from the base statefile, this has the effect of clearing pending operations if a refresh is performed. Fixes #4265.

t0yv0

Changes and the test looking good!

I just really think we should send warnings when --refresh ignores maybe orphaned resources.

I tried adding this but printPendingOperationsError seems tied to Result which seems tied to early termination (in PR or quick follow up). Which API can we use to send warnings into the console for the user? Thanks.

lukehoban · 2021-11-19T17:02:43Z

are normal (non-refresh) updates still blocked on pending operations or they continue rolling forward with a warning (as in Luke's comment in 4625)

Yes, they are still blocked on pending operations.

I do not think I would consider #4265 resolved until we make this change. That is the key point of extreme friction in the current model.

If we want to land this refresh change first, that sounds reasonable, but I think we should leave #4265 open to track addressing the core user experience pain here.

Pavel910 · 2021-12-17T18:12:54Z

I agree with @lukehoban, Pulumi should handle the pending_operations internally on updates. It's a big pain for all of our users. The entire Webiny CMS is relying on Pulumi, and we keep getting questions about this exact issue. Solving this would be a huge step forward! 🍻 ❤️

Zaid-Ajaj · 2022-03-25T09:59:04Z

As discussed with @Frassle I've made a change to this PR where pending CREATE operations are not cleared upon a refresh and these will require user intervention to resolve potentially orphaned resources.

pgavlin requested review from t0yv0 and lukehoban November 16, 2021 20:09

pgavlin added 4 commits November 18, 2021 14:07

CL

e8f3d44

nil ref

87d9be6

adjust a test

88f9ca4

pgavlin force-pushed the pgavlin/refreshPendingOperations branch from a3060a1 to 88f9ca4 Compare November 18, 2021 22:07

t0yv0 approved these changes Nov 19, 2021

View reviewed changes

Zaid-Ajaj mentioned this pull request Mar 22, 2022

Implement "pulumi state clear-pending-operations" #9270

Closed

3 tasks

Zaid-Ajaj added 6 commits March 23, 2022 02:00

Merge branch 'master' into pgavlin/refreshPendingOperations

3f1c1ce

Return nil for the plan when executing deployment

db5ae82

fix lifecycle test

c142229

parallel test for TestRefreshWithPendingOperations

f6ad0b2

preserve pending create operations

b698d2d

Merge branch 'master' into pgavlin/refreshPendingOperations

eb67b17

Zaid-Ajaj merged commit 36c6533 into master Mar 25, 2022

pulumi-bot deleted the pgavlin/refreshPendingOperations branch March 25, 2022 09:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[engine] Clear pending operations with refresh. #8435

[engine] Clear pending operations with refresh. #8435

pgavlin commented Nov 16, 2021 •

edited by emiliza

pgavlin commented Nov 16, 2021

t0yv0 commented Nov 17, 2021

t0yv0 commented Nov 17, 2021

pgavlin commented Nov 17, 2021

t0yv0 left a comment

lukehoban commented Nov 19, 2021

Pavel910 commented Dec 17, 2021 •

edited

Zaid-Ajaj commented Mar 25, 2022

[engine] Clear pending operations with refresh. #8435

[engine] Clear pending operations with refresh. #8435

Conversation

pgavlin commented Nov 16, 2021 • edited by emiliza

pgavlin commented Nov 16, 2021

t0yv0 commented Nov 17, 2021

t0yv0 commented Nov 17, 2021

pgavlin commented Nov 17, 2021

t0yv0 left a comment

Choose a reason for hiding this comment

lukehoban commented Nov 19, 2021

Pavel910 commented Dec 17, 2021 • edited

Zaid-Ajaj commented Mar 25, 2022

pgavlin commented Nov 16, 2021 •

edited by emiliza

Pavel910 commented Dec 17, 2021 •

edited