Commit 1797020

Ming Lei authored and axboe committed
ublk: document zero copy feature
Add words to explain how zero copy feature works, and why it has to be
trusted for handling IO read command.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250327095123.179113-8-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
1 parent ebf695f commit 1797020

1 file changed: +26 -9 lines changed

Documentation/block/ublk.rst

Lines changed: 26 additions & 9 deletions
@@ -309,18 +309,35 @@ with specified IO tag in the command data:
 ``UBLK_IO_COMMIT_AND_FETCH_REQ`` to the server, ublkdrv needs to copy
 the server buffer (pages) read to the IO request pages.

-Future development
-==================
-
 Zero copy
 ---------

-Zero copy is a generic requirement for nbd, fuse or similar drivers. A
-problem [#xiaoguang]_ Xiaoguang mentioned is that pages mapped to userspace
-can't be remapped any more in kernel with existing mm interfaces. This can
-occurs when destining direct IO to ``/dev/ublkb*``. Also, he reported that
-big requests (IO size >= 256 KB) may benefit a lot from zero copy.
-
+ublk zero copy relies on io_uring's fixed kernel buffer, which provides
+two APIs: `io_buffer_register_bvec()` and `io_buffer_unregister_bvec`.
+
+ublk adds IO command of `UBLK_IO_REGISTER_IO_BUF` to call
+`io_buffer_register_bvec()` for ublk server to register client request
+buffer into io_uring buffer table, then ublk server can submit io_uring
+IOs with the registered buffer index. IO command of `UBLK_IO_UNREGISTER_IO_BUF`
+calls `io_buffer_unregister_bvec()` to unregister the buffer, which is
+guaranteed to be live between calling `io_buffer_register_bvec()` and
+`io_buffer_unregister_bvec()`. Any io_uring operation which supports this
+kind of kernel buffer will grab one reference of the buffer until the
+operation is completed.
+
+ublk server implementing zero copy or user copy has to be CAP_SYS_ADMIN and
+be trusted, because it is ublk server's responsibility to make sure IO buffer
+filled with data for handling read command, and ublk server has to return
+correct result to ublk driver when handling READ command, and the result
+has to match with how many bytes filled to the IO buffer. Otherwise,
+uninitialized kernel IO buffer will be exposed to client application.
+
+ublk server needs to align the parameter of `struct ublk_param_dma_align`
+with backend for zero copy to work correctly.
+
+For reaching best IO performance, ublk server should align its segment
+parameter of `struct ublk_param_segment` with backend for avoiding
+unnecessary IO split, which usually hurts io_uring performance.

 References
 ==========
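
The buffer lifetime rule in the added zero copy text (the buffer is live between register and unregister, and every io_uring operation using it holds one reference until it completes) can be pictured with a small userspace model. This is only a sketch: `struct toy_buf` and all the `toy_*` helpers are hypothetical names invented for illustration, they are not kernel or liburing APIs, and the real bookkeeping in io_uring and ublk is more involved.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of a registered kernel buffer. Every name here is hypothetical;
 * this is not how io_uring or ublkdrv actually lay out their state.
 */
struct toy_buf {
	int refs;        /* one for the table entry, plus one per in-flight op */
	bool registered; /* still reachable through the modelled buffer table */
	bool released;   /* set once the last reference has been dropped */
};

/* Drop one reference; the buffer is released when none remain. */
static void toy_put(struct toy_buf *b)
{
	assert(b->refs > 0);
	if (--b->refs == 0)
		b->released = true;
}

/* Stands in for UBLK_IO_REGISTER_IO_BUF: the table takes the first reference. */
static void toy_register(struct toy_buf *b)
{
	b->refs = 1;
	b->registered = true;
	b->released = false;
}

/* Stands in for an io_uring operation that starts using the buffer by index. */
static bool toy_op_start(struct toy_buf *b)
{
	if (!b->registered)
		return false; /* nothing at that index any more */
	b->refs++;
	return true;
}

/* The operation finished, so its reference goes away. */
static void toy_op_complete(struct toy_buf *b)
{
	toy_put(b);
}

/* Stands in for UBLK_IO_UNREGISTER_IO_BUF: drop the table's reference. */
static void toy_unregister(struct toy_buf *b)
{
	assert(b->registered);
	b->registered = false;
	toy_put(b);
}
```

In this model, an operation started before `toy_unregister()` keeps the buffer alive until `toy_op_complete()` runs, while an operation started after unregistering is refused. That is the property the documentation text relies on when it says the buffer is guaranteed to be live while in use.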
