Conversation

Contributor

@jmpesp jmpesp commented Nov 17, 2025

Add the ability to create LocalStorage type Disks.

From the API this means that the `DiskCreate` struct has been changed into an enum, with variants for `Crucible` and `LocalStorage`. `LocalStorage` disks will be attached and detached just like Crucible disks, but will (currently) not support any of the other operations on a disk (snapshot, import, etc.).

Note that this is a breaking change to the Nexus external API - see `nexus/types/src/external_api/params.rs` for more on this. Note too that `content = "content"` was added to `InstanceDiskAttachment` because otherwise the type wouldn't pass the openapi lint step.

A separate "local storage dataset allocation" row will be created during instance allocation in the next PR, slicing a portion of the zpool's local storage dataset. This will be delegated to the Propolis zone and used as file-backed block storage for an NVMe device. According to fio runs, using a zvol this way was measured both to be the fastest in terms of 4k IOPS and to have the least standard deviation in those IOPS measurements.

Contributor Author

jmpesp commented Nov 18, 2025

#
# Download the Oxide CLI and images from catacomb.
#
pfexec mkdir -p /usr/oxide
pfexec curl -sSfL -o /usr/oxide/oxide \
    http://catacomb.eng.oxide.computer:12346/oxide-v0.1.1
pfexec chmod +x /usr/oxide/oxide

The helios / deploy job grabs what seems like a very old Oxide CLI, and this PR has breaking changes for endpoints, so that job is failing.

Member

@hawkw hawkw left a comment


Overall, this makes sense and is pretty straightforward. I had a bunch of obnoxious nitpicks, sorry! 😅

Comment on lines +2322 to +2337
let Some(instance) = instances.into_iter().next() else {
    bail!("no instance: {} found", instance_uuid);
};

let instance_name = instance.instance().name().to_string();

if instance.vmm().is_some() {
    let propolis_id =
        instance.instance().runtime().propolis_id.unwrap();
    let my_sled_id = instance.sled_id().unwrap();

    let (_, my_sled) = LookupPath::new(opctx, datastore)
        .sled_id(my_sled_id)
        .fetch()
        .await
        .context("failed to look up sled")?;
Member


In the cases where the instance doesn't exist, or the sled doesn't exist, that's probably an invalid state. But since OMDB is a debugging tool, I think it would be nicer if we didn't bail out here and instead printed that we couldn't find the instance/sled and kept going, so that we can still get partial output. OMDB ought to be usable in cases where we have put nonsensical state in the database, if possible.
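A minimal sketch of the graceful-degradation idea (not the actual omdb code; the function and messages here are made up for illustration) - report the missing row and keep producing output instead of returning an error:

```rust
// Hypothetical helper: when the instance row is missing, emit a note
// and continue, rather than bailing out of the whole omdb command.
fn describe_instance(instances: Vec<&str>, instance_uuid: &str) -> String {
    match instances.into_iter().next() {
        Some(instance) => format!("instance: {instance}"),
        // Degrade gracefully: note the inconsistency, still return output.
        None => format!("warning: no instance {instance_uuid} found"),
    }
}
```

The same shape would apply to the sled lookup: print "couldn't find sled" and move on to the next section of output.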

Comment on lines +49 to +52
// Don't unwrap this - the size of this disk is a parameter set by an
// API call, and we don't want to panic on out of range input.
let required_dataset_overhead =
external::ByteCount::try_from(overhead)?;
Member


I think we should probably add some kind of context to this error in case it bubbles up to the user. The `TryFrom` error will say "value is too small for a byte count" or "too large for a byte count", but we should tell the user that it's not the byte count they provided for the disk's size, it's something we computed based on that.

Contributor Author


I think so too, but there's currently no way that this would get out to a user from a saga (without querying for a specific node's outputs, which would be brittle to changes in the saga itself). The way we've structured things is that (after some validation of the parameters) the disk create saga takes in the `params::DiskCreate` from the user and does all the work of creating a disk in the saga, and we only return a 500 if saga execution fails.

I put an additional step in validate_disk_create_params that tries to create this record outside the saga, and can return a better error if there's a failure in 135f6a9.
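The kind of contextual error being discussed can be sketched like this (the overhead formula and names are invented for the example; the real computation lives in the datastore code):

```rust
// Illustrative only: wrap the conversion error with the computed value
// and its origin, so the user sees more than a bare "out of range".
fn required_dataset_overhead(disk_size: u64) -> Result<u32, String> {
    // Hypothetical overhead computation derived from the requested size.
    let overhead = disk_size.saturating_mul(2);
    u32::try_from(overhead).map_err(|e| {
        format!(
            "dataset overhead {overhead}, computed from the requested \
             disk size {disk_size}, is not a valid byte count: {e}"
        )
    })
}
```

Validating this up front (as in `validate_disk_create_params`) means the message can reach the API caller instead of being buried in a saga node's output.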

static PROJECT_NAME: &str = "springfield-squidport";

fn get_project_selector() -> String {
    format!("project={}", PROJECT_NAME)
Member


turbo-nit:

Suggested change
format!("project={}", PROJECT_NAME)
format!("project={PROJECT_NAME}")


Member

@hawkw hawkw left a comment


new error messages and such look great, thanks!

.lookup_node_output::<db::datastore::Disk>("created_disk")
.map_err(|e| Error::internal_error(&format!("{:#}", &e)))
.map_err(|e| Error::InternalError {
    internal_message: format!("{:#}", &e),
Member


nit:

Suggested change
internal_message: format!("{:#}", &e),
internal_message: format!("{e:#}",),

(surprised clippy didn't whine about that, i guess the & kept it off your trail...)

Comment on lines 611 to 613
internal_message: format!(
"error sending bulk write to pantry: {}",
e,
Member


again i think clippy would likely prefer:

Suggested change
internal_message: format!(
"error sending bulk write to pantry: {}",
e,
internal_message: format!(
"error sending bulk write to pantry: {e}",

}

/// Different sources for a disk
/// Different sources for a Crucible Disk
Contributor


Suggested change
/// Different sources for a Crucible Disk
/// Different sources for a Distributed Disk

I don't think we want to educate customers about code names, generally.

/// Describe the instance's disks at creation time
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
#[serde(tag = "type", rename_all = "snake_case")]
#[serde(tag = "type", content = "content", rename_all = "snake_case")]
Contributor


why make this adjacently tagged rather than internally tagged as it previously was?

Contributor Author


Unfortunately the openapi linter fails if I don't!

thread 'main' panicked at /home/james/.cargo/git/checkouts/openapi-lint-d70c5ef8a437b633/ef442ee/src/lib.rs:142:48:
not yet implemented: complex 'any' schema not handled AnySchema {
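For context on what changed: with serde, an internally tagged enum inlines the variant's fields next to the tag, while an adjacently tagged one nests them under the `content` key. A rough illustration of the two JSON shapes, using a made-up `Attach { name }` variant (not the real `InstanceDiskAttachment` fields):

```rust
// Illustration only: the JSON shapes serde produces for a hypothetical
// variant Attach { name: "d1" } under the two tagging styles.
fn internally_tagged() -> &'static str {
    // #[serde(tag = "type", rename_all = "snake_case")]
    // inlines the variant's fields next to the tag
    r#"{"type":"attach","name":"d1"}"#
}

fn adjacently_tagged() -> &'static str {
    // #[serde(tag = "type", content = "content", rename_all = "snake_case")]
    // nests the variant's fields under "content"
    r#"{"type":"attach","content":{"name":"d1"}}"#
}
```

The adjacently tagged form avoids the untyped "flattened" schema that trips the linter, at the cost of an extra level of nesting on the wire.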

Contributor


Did you investigate that? Can you send me the full output please?

Comment on lines +1754 to +1776
#[serde(tag = "type", rename_all = "snake_case")]
pub enum DiskCreate {
Crucible {
/// The common identifying metadata for the disk
#[serde(flatten)]
identity: IdentityMetadataCreateParams,

/// The initial source for this disk
disk_source: DiskSource,

/// The total size of the Disk (in bytes)
size: ByteCount,
},

LocalStorage {
/// The common identifying metadata for the disk
#[serde(flatten)]
identity: IdentityMetadataCreateParams,

/// The total size of the Disk (in bytes)
size: ByteCount,
},
}
Contributor


I'd suggest:

  • let's make this back into a struct
  • identity and size is going to be common on any kind of disk
  • we could have a new enum like this that we `#[serde(flatten)]`:
#[serde(tag = "disk_type")]
enum DiskDetails {
    InstanceLocal {
        // There are no local storage parameters... ... ... ... yet.
    },
    #[serde(untagged)]
    Distributed {
        disk_type: Option<TypeThatCanOnlyBeTheStringDistributed>,
        disk_source: DiskSource,
    }
}
  • having disk_type in the untagged Distributed variant allows for backward compat (although we'll probably need to update drift)
  • One could imagine how 3rd-party storage might fit in
  • Is there some way / does there need to be some way to attach an instance to an existing local disk? Presumably one would want to do this at instance-create time

Contributor Author


enum DiskDetails

I also tried this way originally (using the same name even!), but after talking, @david-crespo and I decided on using the higher level enum to push the logical difference to the highest level possible, ending up with the change to DiskCreate shown here.

having disk_type in the untagged Distributed variant allows for backward compat (although we'll probably need to update drift)

This is backwards compatible, but for versioning I'd rather either get #9430 merged or use the new DiskCreate enum with a separate /v2/ path instead - for the Disk model there was a backwards compatible change to add the disk_type field but I can see that changing in the future too (for example maybe we change Disk from a struct to an enum instead of having Optional fields for all the Crucible stuff?).

Is there some way / does there need to be some way to attach an instance to an existing local disk?

Attach and detach just work as they're based only on disk id, and the previous PR to this changed the idea of Disk to apply to both types. Most user code won't have to change.

Contributor


I know there was discussion about changing types with respect to SDK generation for other programming languages (e.g., Go) so I wanted to add some color. These rich enums with struct variants don't map well into Go and that'll likely never change given Go's stance on, well, everything.

That being said, we're already dealing with these sorts of enums in other parts of the API (e.g., DiskSource). We handle that with a pattern called the "fat struct", where the struct used for serialization/deserialization contains fields for all the enum's variants. This works well overall but has some issues: for example, there's no type safety for a given variant, and the pattern doesn't work if variants have the same field name with differing types. We're actively discussing how to improve the Go SDK to more reliably generate Go types from the OpenAPI specification in RFD 614.
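A rough sketch of that "fat struct" pattern in Rust terms (field names are illustrative, loosely modeled on DiskSource variants, not the actual SDK types) - one flat struct carries a field for every variant, and only the fields matching the discriminator are populated:

```rust
// Illustrative "fat struct": a Go-friendly flat wire shape covering
// every variant of a tagged enum. No per-variant type safety - callers
// must check the discriminator before reading the optional fields.
pub struct DiskSourceFat {
    /// Discriminator string, e.g. "blank" or "snapshot".
    pub r#type: String,
    /// Populated only for the "blank" variant.
    pub block_size: Option<u64>,
    /// Populated only for the "snapshot" variant.
    pub snapshot_id: Option<String>,
}
```

This is the trade-off described above: the shape maps cleanly into Go, but nothing stops a consumer from reading `snapshot_id` on a "blank" disk.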

I appreciate that people are thinking about the effects of their changes on SDK generation for other languages. I would recommend using the types that make sense for the server API code if that's going to be more correct and use established patterns on the server side. SDK generation is a consideration, sure, but we will build more established patterns for these sorts of edge cases that don't map well into other languages.

Contributor


I also tried this way originally (using the same name even!), but after talking, @david-crespo and I decided on using the higher level enum to push the logical difference to the highest level possible...

What was the rationale?

This is backwards compatible, but for versioning I'd rather either get #9430 merged or use the new DiskCreate enum with a separate /v2/ path instead

Why would you prefer to do that?

The v1 part of the path isn't really intended for that use, but we could add a v2 of the instance provision operation (instance_create_v2).

Contributor


When you can make it work, I find it easier to think about a type that is cleanly A or B rather than some fields shared and some not. I also thought — it turns out wrongly — that that was more typical in our create params schemas. After seeing this comment I thought about it and realized that probably isn't true. I had Claude go through the OpenAPI schema and confirm it. So I am persuaded. Especially since it looks like the fields that go with Local are currently a subset of the ones on Distributed.


Contributor


ALL HAIL CLAUDE!

Contributor Author


I can't get the disk create struct with the disk details enum to pass the openapi linter either; it's also failing with `not yet implemented: complex 'any' schema not handled AnySchema`.

Contributor


We're actively working on better supporting pattern 2 in the Go SDK generator so I'm onboard with that pattern overall.
