
Conversation

@jmpesp
Contributor

@jmpesp jmpesp commented Apr 10, 2025

Recently, regions_hard_delete was changed to use a CTE to update the size_used column of the crucible_dataset table. Unfortunately, this had the side effect of causing a massive amount of contention: the CTE would

  1. delete rows from the regions table
  2. read from the regions table during the CTE
  3. update the size_used column for all crucible_dataset rows

This almost certainly caused concurrent invocations of the CTE to contend with one another, as seen when deleting disks in parallel.

This commit changes the CTE to:

  1. delete rows from the regions table, returning the affected datasets
  2. read from the regions table during the CTE
  3. update the size_used column for the affected crucible_dataset rows only

This was tested by using Terraform to create and tear down 90 disks with parallelism settings of 10, 20, and 30. Before this change, that workload would not complete, as Nexus would inevitably return 500s.

Fixes #7952
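The three steps above can be sketched with a hypothetical, std-only in-memory model (this is not the actual Nexus/Diesel code; Region, hard_delete_regions, and the maps are illustrative stand-ins): deleting regions returns the set of affected dataset ids, and only those datasets get size_used recomputed, so unrelated crucible_dataset rows are never touched.

```rust
use std::collections::{HashMap, HashSet};

// Illustrative stand-in for a row in the regions table.
struct Region {
    dataset_id: u32,
    size: u64,
}

// Simplified model of the new CTE's behavior:
//   1. delete regions, *returning* the affected dataset ids
//   2. re-read the remaining regions
//   3. update size_used for the affected datasets only
fn hard_delete_regions(
    regions: &mut Vec<Region>,
    size_used: &mut HashMap<u32, u64>,
    to_delete: &HashSet<usize>, // indices of regions to delete
) -> HashSet<u32> {
    let mut affected = HashSet::new();
    let mut kept = Vec::new();
    for (idx, r) in regions.drain(..).enumerate() {
        if to_delete.contains(&idx) {
            affected.insert(r.dataset_id);
        } else {
            kept.push(r);
        }
    }
    *regions = kept;
    // Only the datasets that actually lost a region are recomputed;
    // every other entry in size_used is left alone.
    for id in &affected {
        let total: u64 = regions
            .iter()
            .filter(|r| r.dataset_id == *id)
            .map(|r| r.size)
            .sum();
        size_used.insert(*id, total);
    }
    affected
}
```

Touching only the returned dataset ids is what removes the cross-invocation contention: two parallel deletes over disjoint datasets no longer write to the same rows.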

@jmpesp jmpesp requested a review from smklein April 10, 2025 22:07
    .execute_async(&conn)
    .await?;

let query =
    regions_hard_delete::dataset_update_query(dataset_ids);
Collaborator

Yup, makes a lot of sense to have this be much more scoped!

use uuid::Uuid;

/// Update the affected Crucible dataset rows after hard-deleting regions
pub fn dataset_update_query(
Collaborator

This could probably do with an EXPECTORATE + EXPLAIN test, to make it easier for future changes.

Contributor Author

@jmpesp jmpesp Apr 11, 2025

👍 done in 5522603
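The golden-file idea behind an EXPECTORATE + EXPLAIN test can be sketched with a std-only helper (assert_matches_golden is a hypothetical stand-in, not the real expectorate crate API): the generated SQL, or its EXPLAIN output, is compared against a checked-in copy, so any future change to the query builder surfaces as a reviewable diff rather than a silent behavior change.

```rust
use std::fs;

// Hypothetical golden-file comparison helper. A real expectorate-style
// test would also support regenerating the golden file on demand; this
// sketch only shows the comparison half.
fn assert_matches_golden(generated: &str, golden_path: &str) {
    let expected = fs::read_to_string(golden_path).unwrap_or_default();
    assert_eq!(
        generated.trim(),
        expected.trim(),
        "generated output differs from golden file {golden_path}"
    );
}
```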

Comment on lines 42 to 49
for (idx, dataset_id) in dataset_ids.into_iter().enumerate() {
    if idx != 0 {
        builder.sql(",");
    }
    builder.param().bind::<sql_types::Uuid, _>(dataset_id);
}

builder.sql(
Collaborator

Would it be possible to use:

builder.param().bind::<diesel::pg::sql_types::Array<sql_types::Uuid>, _>(dataset_ids)

Instead of creating a new bind parameter for each individual index?

Contributor Author

I started with this, but hit a runtime error, something like couldn't map from uuid[] -> uuid - maybe I was doing it wrong? idk

Contributor Author

thread 'db::queries::regions_hard_delete::test::explainable' panicked at nexus/db-queries/src/db/queries/regions_hard_delete.rs:104:14:
Failed to explain query - is it valid SQL?: DatabaseError(Unknown, "invalid cast: uuid[] -> uuid")

Collaborator

The following works for me:

-      crucible_dataset.id IN (",
-    );
-
-    for (idx, dataset_id) in dataset_ids.into_iter().enumerate() {
-        if idx != 0 {
-            builder.sql(",");
-        }
-        builder.param().bind::<sql_types::Uuid, _>(dataset_id);
-    }
-
-    builder.sql(
-        ")
+      crucible_dataset.id = ANY (",
+        )
+        .param()
+        .bind::<diesel::pg::sql_types::Array<sql_types::Uuid>, _>(dataset_ids)
+        .sql(
+            ")

Contributor Author

Nice, yeah - I missed how IN is not the same as = ANY, changed in fc37358
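The distinction resolved above can be sketched with a std-only contrast of the two generated WHERE clauses (in_list_clause and any_array_clause are illustrative helpers, not the real Nexus query builder): IN wants a parenthesized list of scalar expressions, so the original code needed one bind parameter per id, while = ANY takes a single array expression, so one uuid[]-typed bind covers any number of ids.

```rust
// Before: one scalar placeholder per dataset id, joined into an IN list.
fn in_list_clause(n_ids: usize) -> String {
    let placeholders: Vec<String> =
        (1..=n_ids).map(|i| format!("${i}")).collect();
    format!("crucible_dataset.id IN ({})", placeholders.join(", "))
}

// After: a single placeholder, bound once with type uuid[].
// Binding a uuid[] into a scalar IN position is what produced the
// "invalid cast: uuid[] -> uuid" error; = ANY accepts the array directly.
fn any_array_clause() -> &'static str {
    "crucible_dataset.id = ANY ($1)"
}
```

Besides being shorter, the single array bind keeps the SQL text identical regardless of how many ids are passed, which also helps statement caching and golden-file tests.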

@jmpesp jmpesp enabled auto-merge (squash) April 11, 2025 18:16
@jmpesp jmpesp merged commit 2fd10bb into oxidecomputer:main Apr 11, 2025
16 checks passed
@jmpesp jmpesp deleted the regions_hard_delete_cte_contention branch April 11, 2025 19:37
iliana pushed a commit that referenced this pull request Apr 11, 2025

Development

Successfully merging this pull request may close these issues.

Slow disk deletion and resurrection of previously deleted disks