Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unrolling saga actions for external IPs and NICs #1474

Merged
merged 2 commits into from Jul 22, 2022

Conversation

bnaecker
Copy link
Collaborator

Previously, for NICs and external IPs, the saga action that actually
created the records and the corresponding undo were not together in a
saga node. The undo was associated with a preceding action that created
UUIDs for the records. That was because each action may have created
multiple records, where the count was not known when we create the saga
template, and so the forward action performed multiple fallible steps.
To unwind it completely, including reverting the creation of N
resources, in the case where the N + 1th failed, we put the undo in the
preceding step. That worked, but was certainly confusing and bad for
maintenance, since the action and its undo were in different nodes. It
also resulted in a ton of extra complexity in some of the downstream
queries, which now had to be aware of the fact that we might replay a
partially-played saga node. This commit undoes all that.

  • Unroll the instance external IP address query
  • Unroll the network interface creation query
  • Actually delete NICs when we delete the instance, and add test for
    that
  • Simplify network interface query. An additional CTE detecting the
    partial-saga replay case is no longer needed, since we don't run the
    saga that way.

Previously, for NICs and external IPs, the saga action that actually
created the records and the corresponding undo were not together in a
saga node. The undo was associated with a preceding action that created
UUIDs for the records. That was because each action may have created
multiple records, where the count was not known when we create the saga
template, and so the forward action performed multiple fallible steps.
To unwind it completely, including reverting the creation of N
resources, in the case where the N + 1th failed, we put the undo in the
preceding step. That worked, but was certainly confusing and bad for
maintenance, since the action and its undo were in different nodes. It
also resulted in a ton of extra complexity in some of the downstream
queries, which now had to be aware of the fact that we might replay a
partially-played saga node. This commit undoes all that.

- Unroll the instance external IP address query
- Unroll the network interface creation query
- Actually delete NICs when we delete the instance, and add test for
  that
- Simplify network interface query. An additional CTE detecting the
  partial-saga replay case is no longer needed, since we don't run the
  saga that way.
@bnaecker bnaecker requested a review from smklein July 21, 2022 17:23
@bnaecker
Copy link
Collaborator Author

Resolves #1466

nexus/src/app/sagas/instance_create.rs Outdated Show resolved Hide resolved
nexus/src/app/sagas/instance_create.rs Outdated Show resolved Hide resolved
nexus/src/app/sagas/instance_create.rs Show resolved Hide resolved
nexus/src/app/sagas/instance_create.rs Show resolved Hide resolved
- Create UUIDs in separate saga actions for NICs and external IPs
- Remove remaining cruft from NIC-creation query that was required to
  handle the previous saga implementation.
@bnaecker bnaecker requested a review from smklein July 22, 2022 17:24
@bnaecker bnaecker enabled auto-merge (squash) July 22, 2022 18:12
@bnaecker bnaecker merged commit 8d13230 into main Jul 22, 2022
@bnaecker bnaecker deleted the unroll-saga-action-loops branch July 22, 2022 18:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants