@@ -309,18 +309,35 @@ with specified IO tag in the command data:
309309 ``UBLK_IO_COMMIT_AND_FETCH_REQ`` to the server, ublkdrv needs to copy
310310 the server buffer (pages) read to the IO request pages.
311311
312- Future development
313- ==================
314-
315312Zero copy
316313---------
317314
318- Zero copy is a generic requirement for nbd, fuse or similar drivers. A
319- problem [#xiaoguang]_ Xiaoguang mentioned is that pages mapped to userspace
320- can't be remapped any more in kernel with existing mm interfaces. This can
321- occurs when destining direct IO to ``/dev/ublkb*``. Also, he reported that
322- big requests (IO size >= 256 KB) may benefit a lot from zero copy.
323-
315+ ublk zero copy relies on io_uring's fixed kernel buffer, which provides
316+ two APIs: `io_buffer_register_bvec()` and `io_buffer_unregister_bvec()`.
317+
318+ ublk adds the IO command `UBLK_IO_REGISTER_IO_BUF`, which calls
319+ `io_buffer_register_bvec()` so that the ublk server can register the client
320+ request buffer into the io_uring buffer table; the ublk server can then submit
321+ io_uring IOs with the registered buffer index. The IO command `UBLK_IO_UNREGISTER_IO_BUF`
322+ calls `io_buffer_unregister_bvec()` to unregister the buffer, which is
323+ guaranteed to be live between calling `io_buffer_register_bvec()` and
324+ `io_buffer_unregister_bvec()`. Any io_uring operation which supports this
325+ kind of kernel buffer will grab one reference of the buffer until the
326+ operation is completed.
327+
328+ A ublk server implementing zero copy or user copy has to be CAP_SYS_ADMIN and
329+ be trusted, because it is the ublk server's responsibility to make sure the IO
330+ buffer is filled with data when handling a READ command. The ublk server also
331+ has to return the correct result to the ublk driver when handling a READ command,
332+ and the result has to match the number of bytes filled into the IO buffer.
333+ Otherwise, an uninitialized kernel IO buffer will be exposed to the client application.
334+
335+ The ublk server needs to align the parameter of `struct ublk_param_dma_align`
336+ with the backend for zero copy to work correctly.
337+
338+ To reach the best IO performance, the ublk server should align its segment
339+ parameter of `struct ublk_param_segment` with the backend to avoid
340+ unnecessary IO splits, which usually hurt io_uring performance.
324341
325342References
326343==========
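The READ-handling rule in the added zero copy text (the reported result has to match the number of bytes filled into the IO buffer) could be sketched as below. This is only an illustration under stated assumptions: `read_result()` is a hypothetical helper, not part of the ublk UAPI, and it assumes the result is reported as a signed int that carries either a byte count or a negative errno value.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the ublk UAPI): compute the result a
 * ublk server would report for a READ request.
 *
 * req_len:     number of bytes the client asked for
 * backend_ret: number of bytes the backend actually filled into the IO
 *              buffer, or a negative errno value on failure
 */
static int read_result(size_t req_len, long backend_ret)
{
	/* Assumption: the result is carried in a signed int. */
	assert(req_len <= INT_MAX);

	if (backend_ret < 0)
		return (int)backend_ret;	/* propagate the backend error */
	if ((size_t)backend_ret > req_len)
		return -EIO;			/* more bytes claimed than requested */
	return (int)backend_ret;		/* report only the bytes really filled */
}
```

With this shape, a short backend read is reported as a short result rather than as the full request length, so the driver is never told that unfilled bytes are valid.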