Local storage 2/4: Add the new disk type #9409

jmpesp · 2025-11-17T20:20:14Z

Add the ability to create LocalStorage type Disks.

From the API this means that the DiskCreate struct has been changed into an enum, with variants for Crucible and LocalStorage. LocalStorage disks will be attached and detached just like Crucible disks, but will (currently) not support any of the other operations on a disk (snapshot, import, etc).

Note that this is a breaking change to the Nexus external API - see nexus/types/src/external_api/params.rs for more on this. Note too that content = "content" was added to InstanceDiskAttachment because otherwise the type wouldn't pass the openapi lint step.

A separate "local storage dataset allocation" row will be created during instance allocation in the next PR, slicing a portion of the zpool's local storage dataset. This will be delegated to the Propolis zone and used as file backed block storage for an NVMe device. Using a zvol this way was both measured to be the fastest in terms of 4k IOPs, and have the least amount of standard deviation in those IOPs measurements, according to fio runs.

Add the ability to create LocalStorage type Disks. From the API this means that the DiskCreate struct has been changed into an enum, with variants for Crucible and LocalStorage. LocalStorage disks will be attached and detached just like Crucible disks, but will (currently) not support any of the other operations on a disk (snapshot, import, etc). Note that this is a breaking change to the Nexus external API - see nexus/types/src/external_api/params.rs for more on this. Note too that `content = "content"` was added to `InstanceDiskAttachment` because otherwise the type wouldn't pass the openapi lint step. A separate "local storage dataset allocation" row will be created during instance allocation in the next PR, slicing a portion of the zpool's local storage dataset. This will be delegated to the Propolis zone and used as file backed block storage for an NVMe device. Using a zvol this way was both measured to be the fastest in terms of 4k IOPs, and have the least amount of standard deviation in those IOPs measurements, according to fio runs.

jmpesp · 2025-11-18T18:08:37Z

#
# Download the Oxide CLI and images from catacomb.
#
pfexec mkdir -p /usr/oxide
pfexec curl -sSfL -o /usr/oxide/oxide \
    http://catacomb.eng.oxide.computer:12346/oxide-v0.1.1
pfexec chmod +x /usr/oxide/oxide

the helios / deploy job grabs what seems like a very old oxide cli, and this PR has breaking changes for endpoints, so that job is failing

hawkw

Overall, this makes sense and is pretty straightforward. I had a bunch of obnoxious nitpicks, sorry! 😅

dev-tools/omdb/src/bin/omdb/db.rs

hawkw · 2025-11-19T19:34:32Z

dev-tools/omdb/src/bin/omdb/db.rs

+        let Some(instance) = instances.into_iter().next() else {
+            bail!("no instance: {} found", instance_uuid);
+        };
+
+        let instance_name = instance.instance().name().to_string();
+
+        if instance.vmm().is_some() {
+            let propolis_id =
+                instance.instance().runtime().propolis_id.unwrap();
+            let my_sled_id = instance.sled_id().unwrap();
+
+            let (_, my_sled) = LookupPath::new(opctx, datastore)
+                .sled_id(my_sled_id)
+                .fetch()
+                .await
+                .context("failed to look up sled")?;


in the cases where the instance doesn't exist, or the sled doesn't exist, that's probably an invalid state. but since OMDB is a debugging tool, i think it would be nicer if we didn't bail out here and just printed that we couldn't find the instance/sled and keep going so that we can still get partial output. OMDB ought to be useable in cases where we have put nonsensical state in the database, if possible.

dev-tools/omdb/src/bin/omdb/db.rs

hawkw · 2025-11-19T19:38:05Z

nexus/db-model/src/disk_type_local_storage.rs

+        // Don't unwrap this - the size of this disk is a parameter set by an
+        // API call, and we don't want to panic on out of range input.
+        let required_dataset_overhead =
+            external::ByteCount::try_from(overhead)?;


i think we should probably add some kind of context to this error in the case that it bubbles up to the user. the TryFrom error will say "value is too small for a byte count" or "too large for a byte count", but we should tell the user that it's not the byte count they provided for the disk's size, it's something we computed based on that.

I think so too, but there's currently no way that this would get out to a user from a saga (without querying for specific node's outputs, which would be brittle to changes in the saga itself). The way we've structured things is that (after some validation of the parameters) the disk create saga takes in the params::DiskCreate from the user and does all the work of creating a disk in the saga, and we only return a 500 if saga execution fails.

I put an additional step in validate_disk_create_params that tries to create this record outside the saga, and can return a better error if there's a failure in 135f6a9.

nexus/db-queries/src/db/datastore/disk.rs

nexus/src/app/instance.rs

nexus/src/app/sagas/disk_create.rs

hawkw · 2025-11-19T20:13:42Z

nexus/tests/integration_tests/local_storage.rs

+static PROJECT_NAME: &str = "springfield-squidport";
+
+fn get_project_selector() -> String {
+    format!("project={}", PROJECT_NAME)


turbo-nit:

Suggested change

format!("project={}", PROJECT_NAME)

format!("project={PROJECT_NAME}")

nexus/tests/integration_tests/local_storage.rs

hawkw

new error messages and such look great, thanks!

hawkw · 2025-11-21T23:24:07Z

nexus/src/app/disk.rs

            .lookup_node_output::<db::datastore::Disk>("created_disk")
-            .map_err(|e| Error::internal_error(&format!("{:#}", &e)))
+            .map_err(|e| Error::InteralError {
+                internal_message: format!("{:#}", &e),


nit:

Suggested change

internal_message: format!("{:#}", &e),

internal_message: format!("{e:#}",),

(surprised clippy didn't whine about that, i guess the & kept it off your trail...)

hawkw · 2025-11-21T23:24:28Z

nexus/src/app/disk.rs

+                        internal_message: format!(
+                            "error sending bulk write to pantry: {}",
+                            e,


again i think clippy would likely prefer:

Suggested change

internal_message: format!(

"error sending bulk write to pantry: {}",

e,

internal_message: format!(

"error sending bulk write to pantry: {e}",

ahl · 2025-11-24T19:33:46Z

nexus/types/src/external_api/params.rs

 }

-/// Different sources for a disk
+/// Different sources for a Crucible Disk


Suggested change

/// Different sources for a Crucible Disk

/// Different sources for a Distributed Disk

I don't think we want to educate customers about code names generally

ahl · 2025-11-24T19:34:44Z

nexus/types/src/external_api/params.rs

 /// Describe the instance's disks at creation time
 #[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
-#[serde(tag = "type", rename_all = "snake_case")]
+#[serde(tag = "type", content = "content", rename_all = "snake_case")]


why make this adjacently tagged rather than internally tagged as it previously was?

Unfortunately the openapi linter fails if I don't!

thread 'main' panicked at /home/james/.cargo/git/checkouts/openapi-lint-d70c5ef8a437b633/ef442ee/src/lib.rs:142:48: not yet implemented: complex 'any' schema not handled AnySchema {

Did you investigate that? Can you send me the full output please?

ahl · 2025-11-24T19:36:30Z

nexus/types/src/external_api/params.rs

+#[serde(tag = "type", rename_all = "snake_case")]
+pub enum DiskCreate {
+    Crucible {
+        /// The common identifying metadata for the disk
+        #[serde(flatten)]
+        identity: IdentityMetadataCreateParams,
+
+        /// The initial source for this disk
+        disk_source: DiskSource,
+
+        /// The total size of the Disk (in bytes)
+        size: ByteCount,
+    },
+
+    LocalStorage {
+        /// The common identifying metadata for the disk
+        #[serde(flatten)]
+        identity: IdentityMetadataCreateParams,
+
+        /// The total size of the Disk (in bytes)
+        size: ByteCount,
+    },
+}


I'd suggestion:

let's make this back into a struct

identity and size is going to be common on any kind of disk

we could have a new enum like this that we `#[serde(flatten)]:

#[serde(tag = "disk_type")] enum DiskDetails { InstanceLocal { // There are no local storage parameters... ... ... ... yet. }, #[serde(untagged)] Distributed { disk_type: Option<TypeThatCanOnlyBeTheStringDistributed>, disk_source: DiskSource, } }

having disk_type in the untagged Distributed variant allows for backward compat (although we'll probably need to update drift)

One could imagine how 3rd-party storage might fit in

Is there some way / does there need to be some way to attach an instance to an existing local disk? Presumably one would want to do this at instance-create time

enum DiskDetails

I also tried this way originally (using the same name even!), but after talking, @david-crespo and I decided on using the higher level enum to push the logical difference to the highest level possible, ending up with the change to DiskCreate shown here.

having disk_type in the untagged Distributed variant allows for backward compat (although we'll probably need to update drift)

This is backwards compatible, but for versioning I'd rather either get #9430 merged or use the new DiskCreate enum with a separate /v2/ path instead - for the Disk model there was a backwards compatible change to add the disk_type field but I can see that changing in the future too (for example maybe we change Disk from a struct to an enum instead of having Optional fields for all the Crucible stuff?).

Is there some way / does there need to be some way to attach an instance to an existing local disk?

Attach and detach just work as they're based only on disk id, and the previous PR to this changed the idea of Disk to apply to both types. Most user code won't have to change.

I know there was discussion about changing types with respect to SDK generation for other programming languages (e.g., Go) so I wanted to add some color. These rich enums with struct variants don't map well into Go and that'll likely never change given Go's stance on, well, everything.

That being said, we're already dealing with these sorts of enums in other parts of the API (e.g., DiskSource). We do that with a pattern called "fat struct" where the struct used for serialization/deserialization contains fields for all the enums variants. This works well overall but has some issues. For example, no type safety for a given variant and this pattern doesn't work if variants have the same field name of differing types. We're actively discussing how to improve the Go SDK to more reliably generate Go types from the OpenAPI specification in RFD 614.

I appreciate that people are thinking about the effects of their changes in SDK generation for other languages. I would recommend using the types that make sense for the server API code if it's going to be more correct and use established patterns on the server side. The SDK generation is a consideration, sure, but we will build more established patterns for these sorts of edge cases that don't map well into other languages.

I also tried this way originally (using the same name even!), but after talking, @david-crespo and I decided on using the higher level enum to push the logical difference to the highest level possible...

What was the rationale?

This is backwards compatible, but for versioning I'd rather either get #9430 merged or use the new DiskCreate enum with a separate /v2/ path instead

Why would you prefer to do that?

The v1 part of the path isn't really intended for that use, but we could add a v2 of the instance provision operation (instance_create_v2).

When you can make it work, I find it easier to think about a type that is cleanly A or B rather than some fields shared and some not. I also thought — it turns out wrongly — that that was more typical in our create params schemas. After seeing this comment I thought about it and realized that probably isn't true. I had Claude go through the OpenAPI schema and confirm it. So I am persuaded. Especially since it looks like the fields that go with Local are currently a subset of the ones on Distributed.

ALL HAIL CLAUDE!

I can't get the disk create struct with disk details enum to pass the openapi linter either, it's also failing with not yet implemented: complex 'any' schema not handled AnySchema.

We're actively working on better supporting pattern 2 in the Go SDK generator so I'm onboard with that pattern overall.

jmpesp added 3 commits November 17, 2025 20:04

print generic disk information for local storage disks

e141a46

missing index

f7c8bd7

hawkw reviewed Nov 19, 2025

View reviewed changes

jmpesp added 7 commits November 20, 2025 22:36

validate DiskTypeLocalStorage can be created before saga

135f6a9

internal_context and try not wrapping in Some

950ba73

more descriptive error, also owned string

e970dfe

Error::internal_error rampage

0fe00c8

what is it

658be0b

turbo

355efd8

wrap to 80

f3259d5

hawkw approved these changes Nov 21, 2025

View reviewed changes

jmpesp added 2 commits November 22, 2025 01:36

fmt and clippy

b67ddc4

Merge branch 'main' into local_storage_2_new_disk_type

3c268e2

ahl reviewed Nov 24, 2025

View reviewed changes

	format!("project={}", PROJECT_NAME)
	format!("project={PROJECT_NAME}")

	internal_message: format!("{:#}", &e),
	internal_message: format!("{e:#}",),

	/// Different sources for a Crucible Disk
	/// Different sources for a Distributed Disk

Local storage 2/4: Add the new disk type #9409

Are you sure you want to change the base?

Local storage 2/4: Add the new disk type #9409

Uh oh!

Conversation

jmpesp commented Nov 17, 2025

Uh oh!

jmpesp commented Nov 18, 2025

Uh oh!

hawkw left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

hawkw left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants