Skip to content

Conversation

@smklein
Copy link
Collaborator

@smklein smklein commented Dec 23, 2022

  • Makes snapshot creation unwind-safe
  • Adds tests to validate that unwinding the saga does not leave detritus

Part of #2052

@smklein smklein changed the title Snapshot create saga idempotent Snapshot create saga unwind-safe Dec 23, 2022
@smklein smklein changed the title Snapshot create saga unwind-safe [nexus] Make snapshot-create saga unwind-safe Dec 26, 2022
@smklein smklein marked this pull request as ready for review December 26, 2022 19:05

#[cfg(test)]
mod test {
pub(crate) mod test {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mostly modifying this so I can re-use some of the test functions which validates "disk creation doesn't leave detritus".

Most of these checks are identical for snapshot creation.

Comment on lines +1400 to +1404
// Unfortunately, for our idempotency checks, checking for a "clean
// slate" gets more expensive when we need to compare region allocations
// between the disk and the snapshot. If we can undo the snapshot
// provisioning AND delete the disk together, these checks are much
// simpler to write.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This detail matters - basically, it's easy to check "nothing exists in the DB", and harder to check "the state which was provisioned for a disk exists and matches, but nothing else was added".

Unwinding the snapshot AND deleting the disk gets us back to the simpler "nothing should exist" state.

@smklein smklein requested a review from leftwo December 26, 2022 19:15
Comment on lines +777 to +788
// NOTE: If we later create a volume record and delete it, the
// running snapshot may be deleted (see:
// ssc_create_volume_record_undo).
//
// To cope, we treat "running snapshot not found" as "Ok", since it
// may just be the result of the volume deletion steps completing.
.or_else(|err| match err {
ErrorResponse(r) if r.status() == StatusCode::NOT_FOUND => {
Ok(())
}
_ => Err(err),
})?;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this a little quirky; figured I'd call it out.

  • If we haven't run the later "volume creation" undo step, we need to cleanup this running snapshot.
  • If we have run the later "volume creation" undo step, it's already gone

Base automatically changed from snapshot-delete-saga to main December 28, 2022 20:48
@smklein smklein enabled auto-merge (squash) December 29, 2022 06:37
@smklein smklein merged commit 329ca35 into main Dec 29, 2022
@smklein smklein deleted the snapshot-create-saga-idempotent branch December 29, 2022 07:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants