
Implement NVMe driver #1203

Closed
wkozaczuk opened this issue Jul 28, 2022 · 6 comments · Fixed by #1317

Comments

@wkozaczuk
Collaborator

This is one of the blocks of work to allow running OSv on KVM-based EC2 instances.

Here are some helpful resources:

Testing this should not be too difficult as QEMU supports NVMe.
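For reference, a minimal sketch of attaching an emulated NVMe disk to a QEMU guest (the image path and serial number below are just placeholders) could look like:

qemu-system-x86_64 \
  -drive file=disk.img,if=none,id=nvme0 \
  -device nvme,drive=nvme0,serial=deadbeef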

Refs #924

wkozaczuk changed the title from "Implement an NVMe driver" to "Implement NVMe driver" Jul 28, 2022
@wkozaczuk
Collaborator Author

My "archeology" effort revealed that @imp implemented NVMe driver for OSv in 2015 but it looks like the branch and his fork -https://github.com/imp/osv/tree/nvme - are sadly gone :-(

@imp
Contributor

imp commented Nov 7, 2023

@wkozaczuk Wow! That work still lives in https://gitlab.com/osv/osv/-/commits/nvme/, although I am not sure how applicable that code would be after all those years.

@wkozaczuk
Collaborator Author

Thanks a lot, @imp!

I have very recently learned that a student has been working on adding NVMe support to OSv as part of his Bachelor's thesis. So we may have two implementations soon!

As you point out, yours may be a little outdated, but many of the ideas and concepts are not. It will certainly be useful to look at, at least for comparison.

Please keep this branch around if you do not mind.

@Meandres
Contributor

Meandres commented Nov 15, 2023

I am currently using an NVMe driver on OSv that we implemented based on uNVMe. It works for my use case, but it is not included in the OSv devfs nor in the VFS; therefore, for now, I rely on calling the driver's functions directly from the application.
You can find the code at https://github.com/Meandres/osv/tree/master/drivers, especially the *nvme* files and pt.cpp (this is very hacky as well and would need to be integrated with the OSv memory subsystem).
I have tested this with a QEMU-emulated device and I am currently using it with PCI passthrough. Both run fine, except that there seem to be some scalability issues. I am not sure at the moment whether this driver is the problem or not.
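For context, the PCI-passthrough setup mentioned above is typically done with QEMU's vfio-pci device; a rough sketch (the PCI address 01:00.0 is a placeholder, and the host NVMe controller must first be bound to the vfio-pci driver) looks like:

qemu-system-x86_64 -enable-kvm -device vfio-pci,host=01:00.0 ...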

@wkozaczuk
Collaborator Author

wkozaczuk commented Jan 8, 2024

@Meandres Thanks, I will take a look.

@wkozaczuk
Collaborator Author

BTW, we also have another PR implementing NVMe - #1284.

wkozaczuk added a commit to wkozaczuk/osv that referenced this issue Jun 18, 2024
This patch implements the NVMe block device driver. It is largely based
on the pull request submitted by Jan Braunwarth (see
cloudius-systems#1284), so most of the credit goes
to Jan.

As his PR explains, OSv can be started with an emulated NVMe disk on QEMU
like so:

./scripts/run.py --nvme

Compared to Jan's PR, this patch differs in the following ways:
- removes all non-NVMe changes (various bug fixes and ioctl enhancements
  are part of separate PRs)
- replaces most of the heap allocations with stack allocations, which
  should reduce some contention
- tweaks the PRP-handling code to use a lock-less ring buffer, which
  should further reduce contention when allocating memory (see the sketch
  after this list)
- fixes a bug in the I/O queue CQ handling to correctly determine whether
  the SQ is not full
- assumes a single namespace, id 1 (most of the logic to deal with more
  has been preserved)
- reduces the I/O queue size from 256 to 64
- makes the code a little more DRY
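To illustrate the lock-less ring buffer idea mentioned above - this is only a rough sketch, not the actual code in this patch, and the type name, fixed capacity, and single-producer/single-consumer assumption are mine - a pre-allocated pool of PRP list pages could be recycled through a structure like the following:

#include <atomic>
#include <cstddef>

// Fixed-capacity, lock-less single-producer/single-consumer ring.
// Completions push free PRP pages back, submissions pop them, and
// neither path needs to take a lock. N must be a power of two.
template <typename T, size_t N>
class spsc_ring {
public:
    bool push(T item) {
        size_t tail = _tail.load(std::memory_order_relaxed);
        size_t next = (tail + 1) & (N - 1);
        if (next == _head.load(std::memory_order_acquire)) {
            return false; // full
        }
        _buf[tail] = item;
        _tail.store(next, std::memory_order_release);
        return true;
    }

    bool pop(T& item) {
        size_t head = _head.load(std::memory_order_relaxed);
        if (head == _tail.load(std::memory_order_acquire)) {
            return false; // empty
        }
        item = _buf[head];
        _head.store((head + 1) & (N - 1), std::memory_order_release);
        return true;
    }

private:
    T _buf[N];
    std::atomic<size_t> _head{0};
    std::atomic<size_t> _tail{0};
};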

Please note that, as Jan points out, the block cache logic of splitting
reads and writes into 512-byte requests causes very poor performance when
stress testing at the devfs level. However, this behavior is not NVMe
specific and does not affect most applications, which go through the
VFS and a filesystem driver (ZFS, EXT, ROFS) that uses the strategy()
method and bypasses the block cache.

Based on my tests, the NVMe read performance (IOPS and
bytes/s) is 60-70% of virtio-blk's on QEMU. I am not sure how much of
that is due to this implementation of the NVMe driver and how much is
because virtio-blk is by design much faster than anything emulated,
including NVMe.

Closes cloudius-systems#1203

Signed-off-by: Jan Braunwarth <jan.braunwarth@gmail.com>
Signed-off-by: Waldemar Kozaczuk <jwkozaczuk@gmail.com>
wkozaczuk added a commit that referenced this issue Jun 22, 2024
This issue was closed.