[nexus] Make snapshot-create saga unwind-safe #2093
      #[cfg(test)]
    - mod test {
    + pub(crate) mod test {
I'm mostly modifying this so I can reuse some of the test functions that validate that "disk creation doesn't leave detritus".
Most of these checks are identical for snapshot creation.
    // Unfortunately, for our idempotency checks, checking for a "clean
    // slate" gets more expensive when we need to compare region allocations
    // between the disk and the snapshot. If we can undo the snapshot
    // provisioning AND delete the disk together, these checks are much
    // simpler to write.
This detail matters - basically, it's easy to check "nothing exists in the DB", and harder to check "the state which was provisioned for a disk exists and matches, but nothing else was added".
Unwinding the snapshot AND deleting the disk gets us back to the simpler "nothing should exist" state.
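To illustrate the difference between the two kinds of check, here is a minimal sketch. `FakeDb`, `is_clean_slate`, and `matches_baseline` are hypothetical names invented for this example, not the helpers used in the PR. The "clean slate" check needs no extra bookkeeping, while the "disk state is intact" check requires capturing a baseline and comparing against it.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory stand-in for the database tables a test inspects.
#[derive(Clone, Debug, Default, PartialEq)]
struct FakeDb {
    disks: BTreeSet<String>,
    snapshots: BTreeSet<String>,
    region_allocations: BTreeSet<String>,
}

/// The simple check: nothing at all should exist.
fn is_clean_slate(db: &FakeDb) -> bool {
    db.disks.is_empty() && db.snapshots.is_empty() && db.region_allocations.is_empty()
}

/// The harder check: the caller must have recorded `baseline` (the state
/// right after disk creation) and the current state must match it exactly.
fn matches_baseline(db: &FakeDb, baseline: &FakeDb) -> bool {
    db == baseline
}
```

With the first check, a test can delete the disk after unwinding the snapshot and simply assert `is_clean_slate`; with the second, it has to carry the baseline around and keep it in sync with whatever disk creation provisions.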
    // NOTE: If we later create a volume record and delete it, the
    // running snapshot may be deleted (see:
    // ssc_create_volume_record_undo).
    //
    // To cope, we treat "running snapshot not found" as "Ok", since it
    // may just be the result of the volume deletion steps completing.
    .or_else(|err| match err {
        ErrorResponse(r) if r.status() == StatusCode::NOT_FOUND => {
            Ok(())
        }
        _ => Err(err),
    })?;
I found this a little quirky; figured I'd call it out.
- If we haven't run the later "volume creation" undo step, we need to clean up this running snapshot.
- If we have run the later "volume creation" undo step, it's already gone.
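The two cases above can be sketched in isolation as follows. This is a simplified illustration, not the saga code: `ClientError`, `FakeAgent`, and `undo_running_snapshot` are hypothetical stand-ins for the real client error type and the service that owns running snapshots. The point is that the undo step maps "not found" to success, so it behaves the same whether or not another step already removed the snapshot, while other errors still propagate.

```rust
use std::collections::HashSet;

/// Hypothetical error type standing in for an HTTP client error.
#[derive(Debug, PartialEq)]
enum ClientError {
    NotFound,
    Other(String),
}

/// Hypothetical stand-in for the service that holds running snapshots.
struct FakeAgent {
    running_snapshots: HashSet<u32>,
    offline: bool,
}

impl FakeAgent {
    /// Deletes a running snapshot; fails if it is absent or the agent is down.
    fn delete_running_snapshot(&mut self, id: u32) -> Result<(), ClientError> {
        if self.offline {
            return Err(ClientError::Other("agent unreachable".to_string()));
        }
        if self.running_snapshots.remove(&id) {
            Ok(())
        } else {
            Err(ClientError::NotFound)
        }
    }
}

/// Undo step: "not found" counts as success, since an earlier cleanup may
/// already have removed the snapshot. Any other error is returned unchanged.
fn undo_running_snapshot(agent: &mut FakeAgent, id: u32) -> Result<(), ClientError> {
    agent.delete_running_snapshot(id).or_else(|err| match err {
        ClientError::NotFound => Ok(()),
        other => Err(other),
    })
}
```

Calling `undo_running_snapshot` twice for the same id succeeds both times, which is the property an unwind-safe undo step needs.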
Part of #2052