Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nvme o tcp poc #1884

Merged
merged 9 commits into from Feb 9, 2024
Merged

Nvme o tcp poc #1884

merged 9 commits into from Feb 9, 2024

Conversation

ogayot
Copy link
Member

@ogayot ogayot commented Jan 12, 2024

This PR adds basic support for NVMe over TCP, with the following limits:

  • only partitions that are not critical for booting (I hard-coded /home currently) can be placed on remote storage

This PR depends on https://code.launchpad.net/~ogayot/curtin/+git/curtin/+merge/458446 for proper NVMe-over-TCP detection.

For testing, you can s/simple/nvme-over-tcp/ in the makefile and then use make dryrun

Some things to address before merging:

  • Add tests
  • Ensure we do not attempt to create an efistub on a remote drive
  • Ensure that NVMe drives are gray-ed out in the guided screen

I also wanted to ensure that API endpoints would enforce restrictions on remote storage, but since we are going to change the restrictions in the near future, I don't think it is worth the effort here. Desktop is also not the main use-case for NVMe over TCP AFAICT.

Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
@ogayot ogayot force-pushed the nvme-o-tcp-poc branch 4 times, most recently from 9c173b5 to b8c2431 Compare January 19, 2024 15:05
@ogayot ogayot marked this pull request as ready for review January 19, 2024 16:09
snapcraft.yaml Outdated Show resolved Hide resolved
@@ -704,12 +707,21 @@ class Dasd:
preserve: bool = False


@fsobj("nvme_controller")
class NVMeController:
transport: str
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty finite list, correct? An Enum would be nice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should consider what happens if there is a NVMe-oF devices of some other kind on the system.

Copy link
Member Author

@ogayot ogayot Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm not really sure what the other values could be. "rdma" would probably be one of them but I haven't tested it.

Curtin's storage config will not complain if it encounters a value that it does not know about (it only knows about "tcp" really, not even "pcie"). Instead of raising a ValueError if the value for transport is unknown, I'd rather attempt the install. wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with your plan here

Copy link
Collaborator

@mwhudson mwhudson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi this is looking really good. A few niggles though!

@@ -79,6 +79,16 @@ def enable_common_mountpoints(self):
if isinstance(opt.value, str):
opt.enabled = opt.value in common_mountpoints

def disable_all_mountpoints_but_home(self):
"""Currently, filesystems of non-local disks can only be mounted at
/home."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment isn't quite right as users can type whatever they like into the "other" box, including things like / and /boot. Is it worth displaying a warning like we do when the user mounts a partition at say / without reformatting it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, updated the comment and added a warning message:

2024-01-26T14:52:04,155473089+01:00

if device:
ofstype = device.original_fstype()
if ofstype:
self.existing_fs_type = ofstype
initial_path = initial.get("mount")
if not initial_path and remote_storage:
initial["mount"] = "/home"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably shouldn't do this if there is already a mount for /home?

Copy link
Member Author

@ogayot ogayot Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not ideal. But since we show an error message in the form when a mount already exists for the mountpoint, I think it's not that bad.

2024-01-26T14:24:13,147115123+01:00

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that's OK

@@ -117,6 +117,8 @@ def desc(device):
def _desc_disk(disk):
if disk.multipath:
return _("multipath device")
if disk.on_remote_storage():
return _("NVMe/TCP drive")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit like an easy thing to forget if we add iscsi or nvme-oF or whatever; I think you should poke under the covers a bit more here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with a default value of "remote drive". Currently no scenario should lead to this value being displayed since only NVMe over TCP drives are considered remote.

Copy link
Collaborator

@mwhudson mwhudson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm happy with this now. Obviously can't land until curtin changes land.

Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
When trying to delete a partition using the answers-based mechanism,
subiquity tries to call .done() on the ConfirmDeletesStretchy overlay.
However, this method does not exist. The .confirm() method is what we
should use instead.

Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
@ogayot
Copy link
Member Author

ogayot commented Feb 8, 2024

The curtin bits are merged now. @mwhudson it's just missing your approval (since you requested changes earlier) before I can merge. Thanks!

Copy link
Collaborator

@mwhudson mwhudson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry forgot to set the radio button earlier apparently!

@ogayot ogayot merged commit 24f48f0 into canonical:main Feb 9, 2024
12 checks passed
@ogayot ogayot deleted the nvme-o-tcp-poc branch February 9, 2024 11:19
@ogayot ogayot mentioned this pull request Feb 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants