Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

frontend proto changes #169

Merged
merged 6 commits into from
Oct 27, 2022
Merged

frontend proto changes #169

merged 6 commits into from
Oct 27, 2022

Conversation

jainvipin
Copy link
Contributor

Frontend proto updates to incorporates comments in #133, and merges NvmeSubsystem, NvmeController and NvmeNamespace objects. A few notes

  • comments beginning with //?? are really going to be removed but kept for the context and discussions for now
  • have included id as an object key to each object, forward object referencing is so easy with that; this is by far the biggest semantic change in this PR
  • rearranged some enums, assuming no production/compliant implementations will run into compatibility issues.
  • fields are separated by a newline, but we can remove it if the coding guidelines needs to keep it next to each other

Signed-off-by: Vipin Jain jainvipin@gmail.com

@jainvipin jainvipin requested a review from a team as a code owner October 21, 2022 07:29
Signed-off-by: Vipin Jain <jainvipin@gmail.com>
string model_number = 4;

//?? do we need this in subsystem given that controller object would have it
//?? val change
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need this since there could be several controllers in subsystem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

//?? replaces: int64 id = 1;
common.v1.Uuid id = 1;

//?? is this name needed if uuid is uniquely identifying the controller
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see use in the name field of the controller

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in updated commit

// replaces: int64 id = 1;
common.v1.Uuid id = 1;

//?? do we need this if NvmeNamespace is identified using unique uuid
string name = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see use in the name field of the namespace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in updated commit

string model_number = 3;
int64 max_ns = 4;
// object's unique identifier
common.v1.Uuid id = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we wanted to use common.v1.ObjectKey instead of common.v1.Uuid as opaque object's unique identifier
common.v1.Uuid should be used only in places per NVME spec that defines true UUID/NGUID/EU64 for example...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes a commit on this PR coming to move these all keys to reflect that.

int64 id = 1;
// object's unique identifier
//?? replaces: int64 id = 1;
common.v1.Uuid id = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we wanted to use common.v1.ObjectKey instead of common.v1.Uuid as opaque object's unique identifier
common.v1.Uuid should be used only in places per NVME spec that defines true UUID/NGUID/EU64 for example...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

int64 id = 1;
// namespace's unique key
// replaces: int64 id = 1;
common.v1.Uuid id = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we wanted to use common.v1.ObjectKey instead of common.v1.Uuid as opaque object's unique identifier
common.v1.Uuid should be used only in places per NVME spec that defines true UUID/NGUID/EU64 for example...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that should serve generically as well. So, I am good with an opaque string instead of uuid. let's hear from others if anyone wants to opine on it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from me. I'm not entirely convinced the server shouldn't be the one generating the handles, but I'm fine with moving forward in this direction for now.


//?? can we take this out if controller is referring to a subsystem anyways
//?? replaces string subsystem_id = 3;
common.v1.Uuid subsystem_id = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we wanted to use common.v1.ObjectKey instead of common.v1.Uuid as opaque object's unique identifier

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A namespace goes in a subsystem, not in a controller. So this is correct to refer to subsystem. But with dispersed namespaces a namespace can be in multiple subsystems so this shouldn't be in the namespace object. It should just be in the call that adds this namespace to a subsystem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you say that ? I think NS belongs to ctrl not subsystem
see

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A namespace gets assigned to a subsystem in NVMe, not a controller. This is directly from the NVMe spec and that's how your diagram shows it too. The namespaces are in the orange box, not in the blue boxes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so ctrl only have NSID inside ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, that's complicated. There's actually two types of NSID - private and public. Public NSIDs are inside the subsystem, and all controllers see the same set. Private NSIDs are inside the controller. You can think of a private NSID as an extra layer of indirection that aliases the public NSID to a private value, although strictly speaking the private NSID does not have to also appear in the public list.

I've never seen any implementation ever use a private NSID. Certainly, lots of implementations change which public NSIDs are visible to hosts based on access control, but I've never seen an implementation use a private NSID. I don't believe they actually add any functionality, security, or value generally. I do not think we need to support private NSIDs in OPI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor Author

@jainvipin jainvipin Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed the object-id in updated commit.


// key of the PCIe controller object that will host this namespace.
//?? replaces int64 controller_id = 4;
common.v1.Uuid controller_id = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we wanted to use common.v1.ObjectKey instead of common.v1.Uuid as opaque object's unique identifier

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A namespace is not associated with 1 controller. Instead, controllers can view namespaces. Having this reference here is not correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NS is attached to 1 or more controllers

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. A namespace is placed in a subsystem.

Copy link
Contributor Author

@jainvipin jainvipin Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will change - to rename this to objectKey instead.

common.v1.Uuid uuid = 11;

//?? what is this for
string multipath = 12;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no use for now, wanted to reserve for future, can remove now after discussion

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulled from back/middle, not configured here. Remove this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

string multipath = 12;

//?? what is this for
string authentication = 13;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no use for now, wanted to reserve for future, can remove now after discussion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

// for live migration
//?? replaces: int64 nsid = 5;
uint32 host_nsid = 5;

string bdev = 6;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to discuss this bdev as well... I think should be reference to either middle-end object (like encryption, compression, ...) or to the back-end object, like nvmeof bdev... wdyt ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glimchb do you mean replace bdev with objectKey ?

Copy link
Member

@glimchb glimchb Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, maybe... objectKey that points to a backend object

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to be replaced with however OPI references the middle/backend objects. Probably a volume handle/id/key of some sort.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I am not sure about the use for it. if not we can remove/move it to the backend objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in recent commit - all uuids are now objectKey.

string authentication = 13;

// reference to encryption key for the data at rest encryption
common.v1.Uuid crypto_key_id = 14;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this belong here ? I think should be reference to either middle-end object (like encryption, compression, ...) or to the back-end object, like nvmeof bdev... wdyt ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless we're talking PCI bus encryption, this should be part of the back/middle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • backend is we talking about PSK's for TLS for example
  • middleend is we talking about data at rest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, ultimately encryption granularity will need to be specified somewhere, so I put this in here. We can move this to ObjectKey instead, like in other objects; so this is really an attachment point for crypto, not the middleend layer, but a glue.

@glimchb
Copy link
Member

glimchb commented Oct 21, 2022

@jkogutINTC @benlwalker @orendu @llabordehpe please re-review this as well

// for live migration
//?? replaces: int64 nsid = 5;
uint32 host_nsid = 5;

string bdev = 6;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to be replaced with however OPI references the middle/backend objects. Probably a volume handle/id/key of some sort.

string bdev = 6;

// Block size in bytes, must be power of 2 and must be less than the max
// io size supported. Typically tested values are 512, and 4k.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still like to see size and capacity dropped from the front-end. Those are instead implied by the backing device. Even if we were to add support for emulating one block size on top of another that's most likely done in the middle end.

Copy link
Member

@glimchb glimchb Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if those are optional fields that if left empty - taken from backend, I don;t mind leaving them for now here and re-consider again once we defined middle end. baby steps

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emulation is often much trickier than it seems, so my suggested baby step is just delete these two things for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with @benlwalker that these should be dropped. If not, then it needs to be clarified to determine with takes precedence - values from backend object or values from the api. Also for scalar types since the ‘ default/value not set’ would be 0 it could also be a valid value, so it's not entirely harmful. I think the comment is also relevant for the fields: optimal_write_size, write_granularity below

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if @jainvipin have a use case for having them, I incline to leave them
Agree with @mkalderon that comment to clarify if those provided, they overpower the backend device properties and those are omitted (or negative values like -1) then properties are taken from the backend device

string authentication = 13;

// reference to encryption key for the data at rest encryption
common.v1.Uuid crypto_key_id = 14;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless we're talking PCI bus encryption, this should be part of the back/middle.

// this to regulate IO size. Must be a multiple of the preferred write
// granularity. Must not exceed the controller maximum IO size value
// configured in the nvme agent config file.
uint32 optimal_write_size = 15;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulled from back/middle, not configured here. Remove this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the intent here is what you present to the HOST over PCIe

// preferred write granularity hint to the host driver. Host IO
// stack may use this to align IO sizes to the write granularity for
// optimum performance.
uint32 pref_write_granularity= 16;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulled from back/middle, not configured here. Remove this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the intent here is what you present to the HOST over PCIe

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not for the front-end, or even an administrator, to decide. It's based on the BE/ME device reporting this value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but what if implementation inside DPU/IPU for NVMe emulation has constrains of it's own on top of BE/ME device reporting ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then those constraints do not need to be configured over OPI and can just be set up and reported by the device


// key of the PCIe controller object that will host this namespace.
//?? replaces int64 controller_id = 4;
common.v1.Uuid controller_id = 4;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A namespace is not associated with 1 controller. Instead, controllers can view namespaces. Having this reference here is not correct.

// maximum length for NQN field
NSV_NVME_SUBSYSTEM_NQN_LEN = 256;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've said elsewhere, there's likely no reason to expose a subsystem object in the front-end. An xPU will probably never expose more than 1 controller per subsystem, so the two objects can be grouped into a "device" object. But if someone really has a use case for this in an xPU then I'm listening.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we see use cases with more then 1 controller per subsystem

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree @benlwalker - but I kept it since this was already an object and so moved a few things in there. So I am good removing this all together from the front end objects - however some of the fields included in Subsystem object can be folded into the NvmeController object then.

@@ -38,36 +39,146 @@ service NVMeNamespaceService {
rpc NVMeNamespaceStats (NVMeNamespaceStatsRequest) returns (NVMeNamespaceStatsResponse) {}
}

enum NvmeFrontEndConstants {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't these all spec defined values?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, what do you suggest ?
we can define them like this https://github.com/linux-nvme/libnvme/blob/master/src/nvme/types.h#L99-L129

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest dropping this enum since we know all of the values by spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if they are, but even if they do, putting the here may serve explicitly for different current/future versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest dropping this enum since we know all of the values by spec.

specifications also have versions, having this allows code to work on it via APIs, isn't it? Since these fields are exchanged via a versioned APIs, wouldn't it be better to keep it as part of API, instead of defining this in both client and server. Ultimately the client and server would need to have some checks on the length.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we expecting a single xPU to support different NVMe versions? Setting the version would be a clearer way to communicate that, rather than these lengths. Or are you suggesting some xPUs have stricter limits on the lengths of these values than the spec does, and this is for the xPU to report that back up to the client?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this brings up a question - do we need to have NMVE spec version xpu supports ? like 1.4 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps- let's do in a separate PR then after more discussions/thought.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

// SRIOV Virtual function within the Device and Physical function.
// Set to 0 for Physical Function. Virtual Function numbering starts from 1
uint32 virtual_function = 4;
}

// Path Management Selection
enum NvmePathMgmtType {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what these do? Multipath is almost certainly a property of the backend and shouldn't be in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes multipathing is a backend property (separate PR). however this is allowing selecting the path to be used by a given frontend Namespace object. So think of this as an association to a backend definition.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that makes any sense. You create the backend object, which includes the possible paths, and then you assign that object to the front-end. This should be removed from the front-end configuration.

// maximum Number of namespaces that will be provisioned under
// the controller.
//?? val change
uint32 max_ns = 7;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem necessary. It's in the subsystem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we keep it here and take it out of the subsystem; it also depends on if we take out the subsystem object all together.

Signed-off-by: Vipin Jain <jainvipin@gmail.com>
Signed-off-by: Vipin Jain <jainvipin@gmail.com>
Signed-off-by: Vipin Jain <jainvipin@gmail.com>
@glimchb glimchb added the Storage APIs or code related to storage area label Oct 21, 2022
…ys in request/response messages

Signed-off-by: Vipin Jain <jainvipin@gmail.com>
@jainvipin
Copy link
Contributor Author

jainvipin commented Oct 24, 2022

pushed the final set of changes - pls review @benlwalker @glimchb @mkalderon @orendu

common.v1.Uuid uuid = 10;

// reference to encryption key for the data at rest encryption
common.v1.ObjectKey crypto_key_id = 11;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be more generic, this can be renamed to backend object id
type is ok here
once renamed, can remove the field bdev above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure @glimchb - did you mean middle-end or the backend object-id? What we need to decide is if frontned objects will have reference to a generic bucket of middleend services (e.g. crypto, etc.) or one reference per frontend object say encryption granularity may make sense at a different level than compression or other middle-end services.

Copy link
Member

@glimchb glimchb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now from my perspective
One last nit about crypto_key_id inline
Not clicking "approve" button only because want to give till the end of the week for other people to comment and if no objection, will merge on Thursday

…message

Signed-off-by: Vipin Jain <jainvipin@gmail.com>
@glimchb glimchb added the Merge Candidate in the open merge window, next candidate for merge label Oct 25, 2022
string nqn = 2;

// serial number must not exceed 'NSV_CTRLR_SERIAL_NO_LEN' bytes
string serial_number = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is NSV_CTRLR_SERIAL_NO_LEN etc.. defined in the end somewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we postponed NSV_CTRLR_SERIAL_NO_LEN to next PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so should it still be part of comment ? not critical.. just made me look for it :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, ok... yes, need to remove the comment

string firmware_revision = 6;

// FRU identfier, 16bytes opaque identity for the type of unit
bytes fru_guid = 7;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jainvipin aren't the firmware_revision and fru_guid part of the controller properties and not the subsystem ?

// maximum host IO queue pairs allowed, value will default to
// limits in PCI device configuration; if set to 0 or more it
// will default to maximum permitted per xPU's capability
uint32 max_io_qps = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this limits the ability to define seperate CQ, SQ and attach several SQs to a single CQs as spec permits.
Perhaps should be split ?
also - perhaps we should add the max_queue_depth param ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest if you can please create new PR @mkalderon with both additions:

  • split CQ and SQ
  • add max_queue_depth

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@glimchb
Copy link
Member

glimchb commented Oct 27, 2022

Approved during the API meeting nobody objected. Merging now

@glimchb glimchb merged commit c9df395 into opiproject:main Oct 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Merge Candidate in the open merge window, next candidate for merge Storage APIs or code related to storage area
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants