Reduce regions_hard_delete contention #7960
Conversation
Recently, regions_hard_delete changed to use a CTE to update the size_used column of the crucible_dataset table. Unfortunately this had the side effect of causing a massive amount of contention: the CTE would

1) delete rows from the regions table
2) read from the regions table during the CTE
3) update the size_used column for _all_ crucible_dataset rows

This would almost certainly cause each invocation of the CTE to contend with each other, as seen when doing disk deletes in parallel.

This commit changes the CTE to:

1) delete rows from the regions table, returning the affected datasets
2) read from the regions table during the CTE
3) update the size_used column for affected crucible_dataset rows only

This was tested by using terraform to create and tear down 90 disks, with a parallelism setting of 10, 20, and 30. Before this change, this would not work, as Nexus would inevitably return 500s.

Fixes oxidecomputer#7952
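To make the new shape concrete, here is a minimal sketch, not the PR's actual code: it assumes the delete has already run and produced the list of affected dataset ids, and it guesses at the table and column names behind the size computation. The function name, plain-Uuid argument, and synchronous connection handling are illustrative only.

```rust
use diesel::pg::PgConnection;
use diesel::prelude::*;
use diesel::sql_types;
use uuid::Uuid;

// Minimal sketch: recompute size_used only for the datasets whose regions
// were just deleted, using a single uuid[] bind rather than one bind per id.
// The size expression and table/column names are assumptions for illustration.
fn update_affected_datasets(
    conn: &mut PgConnection,
    dataset_ids: Vec<Uuid>,
) -> diesel::QueryResult<usize> {
    diesel::sql_query(
        "UPDATE crucible_dataset SET size_used = (
             SELECT COALESCE(SUM(block_size * blocks_per_extent * extent_count), 0)
             FROM region
             WHERE region.dataset_id = crucible_dataset.id
         )
         WHERE crucible_dataset.id = ANY ($1)",
    )
    .bind::<diesel::pg::sql_types::Array<sql_types::Uuid>, _>(dataset_ids)
    .execute(conn)
}
```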
```rust
        .execute_async(&conn)
        .await?;

        let query =
            regions_hard_delete::dataset_update_query(dataset_ids);
```
Yup, makes a lot of sense to have this be much more scoped!
```rust
use uuid::Uuid;

/// Update the affected Crucible dataset rows after hard-deleting regions
pub fn dataset_update_query(
```
This could probably do with an EXPECTORATE + EXPLAIN test, to make it easier for future changes.
👍 done in 5522603
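For readers unfamiliar with the pattern being suggested: an EXPECTORATE test snapshots the query's generated SQL into a checked-in file, and an EXPLAIN test runs the query through EXPLAIN against a test database to confirm it is valid SQL. A rough sketch of the snapshot half follows; the output path, the use of diesel::debug_query to render the SQL, and the plain-Uuid argument are all assumptions here, not the code added in 5522603.

```rust
#[cfg(test)]
mod test {
    use super::*;

    // Sketch only: snapshot the generated SQL so that future edits to the
    // query show up as a reviewable diff in the expectorate output file.
    #[test]
    fn expectorate_dataset_update_query() {
        let query = dataset_update_query(vec![Uuid::nil()]);
        let sql = diesel::debug_query::<diesel::pg::Pg, _>(&query).to_string();
        expectorate::assert_contents(
            "tests/output/regions_hard_delete_dataset_update.sql",
            &sql,
        );
    }

    // The EXPLAIN half needs a live test database; the `explainable` test
    // whose panic message appears later in this thread plays that role.
}
```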
```rust
    for (idx, dataset_id) in dataset_ids.into_iter().enumerate() {
        if idx != 0 {
            builder.sql(",");
        }
        builder.param().bind::<sql_types::Uuid, _>(dataset_id);
    }

    builder.sql(
```
Would it be possible to use:

```rust
builder.param().bind::<diesel::pg::sql_types::Array<sql_types::Uuid>, _>(dataset_ids)
```

instead of creating a new bind parameter for each individual index?
I started with this, but hit a runtime error, something like couldn't map from uuid[] -> uuid - maybe I was doing it wrong? idk
```
thread 'db::queries::regions_hard_delete::test::explainable' panicked at nexus/db-queries/src/db/queries/regions_hard_delete.rs:104:14:
Failed to explain query - is it valid SQL?: DatabaseError(Unknown, "invalid cast: uuid[] -> uuid")
```
The following works for me:
```diff
-            crucible_dataset.id IN (",
-    );
-
-    for (idx, dataset_id) in dataset_ids.into_iter().enumerate() {
-        if idx != 0 {
-            builder.sql(",");
-        }
-        builder.param().bind::<sql_types::Uuid, _>(dataset_id);
-    }
-
-    builder.sql(
-        ")
+            crucible_dataset.id = ANY (",
+    )
+    .param()
+    .bind::<diesel::pg::sql_types::Array<sql_types::Uuid>, _>(dataset_ids)
+    .sql(
+        ")
```
Nice, yeah - I missed how IN is not the same as = ANY, changed in fc37358
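For anyone who trips over the same error: with a single bound parameter, `id IN ($1)` asks the database to compare `id` against `$1` as one scalar value of the column's type, which is why binding a `uuid[]` there fails with the `invalid cast: uuid[] -> uuid` error shown above. `id = ANY ($1)` is the array-accepting form, where `$1` may be a `uuid[]` and the comparison is made against each element, so a single Array-of-Uuid bind has to be paired with `= ANY`, as in the diff above.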