ublk: add UBLK_F_BATCH_IO#288
Conversation
|
Upstream branch: dd72c8f |
aca8ada to
97c47c0
Compare
|
Upstream branch: dcb6fa3 |
3c2d287 to
0aa0345
Compare
97c47c0 to
3167cf1
Compare
|
Upstream branch: dcb6fa3 |
0aa0345 to
40df888
Compare
3167cf1 to
151715e
Compare
|
Upstream branch: dcb6fa3 |
40df888 to
134dc8e
Compare
151715e to
629ead1
Compare
|
Upstream branch: dcb6fa3 |
134dc8e to
7095700
Compare
629ead1 to
c01bcb5
Compare
|
Upstream branch: 6146a0f |
7095700 to
0cc4955
Compare
c01bcb5 to
e65f0f5
Compare
|
Upstream branch: c9cfc12 |
0cc4955 to
db1b979
Compare
e65f0f5 to
d9a4e6e
Compare
|
Upstream branch: c9cfc12 |
db1b979 to
47b457a
Compare
d9a4e6e to
4919429
Compare
|
Upstream branch: 4a0c9b3 |
47b457a to
6fcc4de
Compare
4919429 to
434ff52
Compare
batch io is designed to be independent of task context, and we will not track task context for batch io feature. So warn on non-batch-io code paths. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Pass const pointer to ublk_queue_is_zoned() because it is readonly. Signed-off-by: Ming Lei <ming.lei@redhat.com>
…_IO_CMDS Add new command UBLK_U_IO_PREP_IO_CMDS, which is the batch version of UBLK_IO_FETCH_REQ. Add new command UBLK_U_IO_COMMIT_IO_CMDS, which is for committing io command result only, still the batch version. The new command header type is `struct ublk_batch_io`, and fixed buffer is required for these two uring_cmd. This patch doesn't actually implement these commands yet, just validates the SQE fields. Signed-off-by: Ming Lei <ming.lei@redhat.com>
This commit implements the handling of the UBLK_U_IO_PREP_IO_CMDS command, which allows userspace to prepare a batch of I/O requests. The core of this change is the `ublk_walk_cmd_buf` function, which iterates over the elements in the uring_cmd fixed buffer. For each element, it parses the I/O details, finds the corresponding `ublk_io` structure, and prepares it for future dispatch. Add per-io lock for protecting concurrent delivery and committing. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Handle UBLK_U_IO_COMMIT_IO_CMDS by walking the uring_cmd fixed buffer: - read each element into one temp buffer in batch style - parse and apply each element for committing io result Signed-off-by: Ming Lei <ming.lei@redhat.com>
Add ublk io events fifo structure and prepare for supporting command batch, which will use io_uring multishot uring_cmd for fetching one batch of io commands each time. One nice feature of kfifo is to allow multiple producer vs single consumer. We just need lock the producer side, meantime the single consumer can be lockless. The producer is actually from ublk_queue_rq() or ublk_queue_rqs(), so lock contention can be eased by setting proper blk-mq nr_queues. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Add infrastructure for delivering I/O commands to ublk server in batches, preparing for the upcoming UBLK_U_IO_FETCH_IO_CMDS feature. Key components: - struct ublk_batch_fcmd: Represents a batch fetch uring_cmd that will receive multiple I/O tags in a single operation, using io_uring's multishot command for efficient ublk IO delivery. - ublk_batch_dispatch(): Batch version of ublk_dispatch_req() that: * Pulls multiple request tags from the events FIFO (lock-free reader) * Prepares each I/O for delivery (including auto buffer registration) * Delivers tags to userspace via single uring_cmd notification * Handles partial failures by restoring undelivered tags to FIFO The batch approach significantly reduces notification overhead by aggregating multiple I/O completions into single uring_cmd, while maintaining the same I/O processing semantics as individual operations. Error handling ensures system consistency: if buffer selection or CQE posting fails, undelivered tags are restored to the FIFO for retry. This runs in task work context, scheduled via io_uring_cmd_complete_in_task() or called directly from ->uring_cmd(), enabling efficient batch processing without blocking the I/O submission path. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Add UBLK_U_IO_FETCH_IO_CMDS command to enable efficient batch processing of I/O requests. This multishot uring_cmd allows the ublk server to fetch multiple I/O commands in a single operation, significantly reducing submission overhead compared to individual FETCH_REQ* commands. Key Design Features: 1. Multishot Operation: One UBLK_U_IO_FETCH_IO_CMDS can fetch many I/O commands, with the batch size limited by the provided buffer length. 2. Dynamic Load Balancing: Multiple fetch commands can be submitted simultaneously, but only one is active at any time. This enables efficient load distribution across multiple server task contexts. 3. Implicit State Management: The implementation uses three key variables to track state: - evts_fifo: Queue of request tags awaiting processing - fcmd_head: List of available fetch commands - active_fcmd: Currently active fetch command (NULL = none active) States are derived implicitly: - IDLE: No fetch commands available - READY: Fetch commands available, none active - ACTIVE: One fetch command processing events 4. Lockless Reader Optimization: The active fetch command can read from evts_fifo without locking (single reader guarantee), while writers (ublk_queue_rq/ublk_queue_rqs) use evts_lock protection. Implementation Details: - ublk_queue_rq() and ublk_queue_rqs() save request tags to evts_fifo - __ublk_pick_active_fcmd() selects an available fetch command when events arrive and no command is currently active - ublk_batch_dispatch() moves tags from evts_fifo to the fetch command's buffer and posts completion via io_uring_mshot_cmd_post_cqe() - State transitions are coordinated via evts_lock to maintain consistency Signed-off-by: Ming Lei <ming.lei@redhat.com>
In case of BATCH_IO, any request filled in event kfifo, they don't get chance to be dispatched any more when releasing ublk char device, so we have to abort them too. Add ublk_abort_batch_queue() for aborting this kind of requests. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Add new feature UBLK_F_BATCH_IO which replaces the following two per-io commands: - UBLK_U_IO_FETCH_REQ - UBLK_U_IO_COMMIT_AND_FETCH_REQ with three per-queue batch io uring_cmd: - UBLK_U_IO_PREP_IO_CMDS - UBLK_U_IO_COMMIT_IO_CMDS - UBLK_U_IO_FETCH_IO_CMDS Then ublk can deliver batch io commands to ublk server in single multishort uring_cmd, also allows to prepare & commit multiple commands in batch style via single uring_cmd, communication cost is reduced a lot. This feature also doesn't limit task context any more for all supported commands, so any allowed uring_cmd can be issued in any task context. ublk server implementation becomes much easier. Meantime load balance becomes much easier to support with this feature. The command `UBLK_U_IO_FETCH_IO_CMDS` can be issued from multiple task contexts, so each task can adjust this command's buffer length or number of inflight commands for controlling how much load is handled by current task. Later, priority parameter will be added to command `UBLK_U_IO_FETCH_IO_CMDS` for improving load balance support. UBLK_U_IO_GET_DATA isn't supported in batch io yet, but it may be enabled in future via its batch pair. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Document feature UBLK_F_BATCH_IO. Signed-off-by: Ming Lei <ming.lei@redhat.com>
The build_user_data() function packs multiple fields into a __u64 value using bit shifts. Without explicit __u64 casts before shifting, the shift operations are performed on 32-bit unsigned integers before being promoted to 64-bit, causing data loss. Specifically, when tgt_data >= 256, the expression (tgt_data << 24) shifts on a 32-bit value, truncating the upper 8 bits before promotion to __u64. Since tgt_data can be up to 16 bits (assertion allows up to 65535), values >= 256 would have their high byte lost. Add explicit __u64 casts to both op and tgt_data before shifting to ensure the shift operations happen in 64-bit space, preserving all bits of the input values. user_data_to_tgt_data() is only used by stripe.c, in which the max supported member disks are 4, so won't trigger this issue. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Replace assert() with ublk_assert() since it is often triggered in daemon, and we may get nothing shown in terminal. Add ublk_assert(), so we can log something to syslog when assert() is triggered. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Since UBLK_F_PER_IO_DAEMON is added, io buffer index may depend on current thread because the common way is to use per-pthread io_ring_ctx for issuing ublk uring_cmd. Add one helper for returning io buffer index, so we can hide the buffer index implementation details for target code. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Add the foundational infrastructure for UBLK_F_BATCH_IO buffer management including: - Allocator utility functions for small sized per-thread allocation - Batch buffer allocation and deallocation functions - Buffer index management for commit buffers - Thread state management for batch I/O mode - Buffer size calculation based on device features This prepares the groundwork for handling batch I/O commands by establishing the buffer management layer needed for UBLK_U_IO_PREP_IO_CMDS and UBLK_U_IO_COMMIT_IO_CMDS operations. The allocator uses CPU sets for efficient per-thread buffer tracking, and commit buffers are pre-allocated with 2 buffers per thread to handle overlapping command operations. Signed-off-by: Ming Lei <ming.lei@redhat.com>
|
Upstream branch: e9a6fb0 |
Implement support for UBLK_U_IO_PREP_IO_CMDS in the batch I/O framework: - Add batch command initialization and setup functions - Implement prep command queueing with proper buffer management - Add command completion handling for prep and commit commands - Integrate batch I/O setup into thread initialization - Update CQE handling to support batch commands The implementation uses the previously established buffer management infrastructure to queue UBLK_U_IO_PREP_IO_CMDS commands. Commands are prepared in the first thread context and use commit buffers for efficient command batching. Key changes: - ublk_batch_queue_prep_io_cmds() prepares I/O command batches - ublk_batch_compl_cmd() handles batch command completions - Modified thread setup to use batch operations when enabled - Enhanced buffer index calculation for batch mode Signed-off-by: Ming Lei <ming.lei@redhat.com>
Implement UBLK_U_IO_COMMIT_IO_CMDS to enable efficient batched completion of I/O operations in the batch I/O framework. This completes the batch I/O infrastructure by adding the commit phase that notifies the kernel about completed I/O operations: Key features: - Batch multiple I/O completions into single UBLK_U_IO_COMMIT_IO_CMDS - Dynamic commit buffer allocation and management per thread - Automatic commit buffer preparation before processing events - Commit buffer submission after processing completed I/Os - Integration with existing completion workflows Implementation details: - ublk_batch_prep_commit() allocates and initializes commit buffers - ublk_batch_complete_io() adds completed I/Os to current batch - ublk_batch_commit_io_cmds() submits batched completions to kernel - Modified ublk_process_io() to handle batch commit lifecycle - Enhanced ublk_complete_io() to route to batch or legacy completion The commit buffer stores completion information (tag, result, buffer details) for multiple I/Os, then submits them all at once, significantly reducing syscall overhead compared to individual I/O completions. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Add support for UBLK_U_IO_FETCH_IO_CMDS to enable efficient batch fetching of I/O commands using multishot io_uring operations. Key improvements: - Implement multishot UBLK_U_IO_FETCH_IO_CMDS for continuous command fetching - Add fetch buffer management with page-aligned, mlocked buffers - Process fetched I/O command tags from kernel-provided buffers - Integrate fetch operations with existing batch I/O infrastructure - Significantly reduce uring_cmd issuing overhead through batching The implementation uses two fetch buffers per thread with automatic requeuing to maintain continuous I/O command flow. Each fetch operation retrieves multiple command tags in a single syscall, dramatically improving performance compared to individual command fetching. Technical details: - Fetch buffers are page-aligned and mlocked for optimal performance - Uses IORING_URING_CMD_MULTISHOT for continuous operation - Automatic buffer management and requeuing on completion - Enhanced CQE handling for fetch command completions Signed-off-by: Ming Lei <ming.lei@redhat.com>
Add --batch/-b for enabling F_BATCH_IO. Add generic_14 for covering its basic function. Add stress_06 and stress_07 for covering stress test. Signed-off-by: Ming Lei <ming.lei@redhat.com>
Enable flexible thread-to-queue mapping in batch I/O mode to support arbitrary combinations of threads and queues, improving resource utilization and scalability. Key improvements: - Support N:M thread-to-queue mapping (previously limited to 1:1) - Dynamic buffer allocation based on actual queue assignment per thread - Thread-safe queue preparation with spinlock protection - Intelligent buffer index calculation for multi-queue scenarios - Enhanced validation for thread/queue combination constraints Implementation details: - Add q_thread_map matrix to track queue-to-thread assignments - Dynamic allocation of commit and fetch buffers per thread - Round-robin queue assignment algorithm for load balancing - Per-queue spinlock to prevent race conditions during prep - Updated buffer index calculation using queue position within thread This enables efficient configurations like: - Any other N:M combinations for optimal resource matching Testing: - Added test_generic_14.sh: 4 threads vs 1 queue - Added test_generic_15.sh: 1 thread vs 4 queues - Validates correctness across different mapping scenarios Signed-off-by: Ming Lei <ming.lei@redhat.com>
6e9c673 to
3eda0c2
Compare
5121c4d to
4458758
Compare
Pull request for series with
subject: ublk: add UBLK_F_BATCH_IO
version: 2
url: https://patchwork.kernel.org/project/linux-block/list/?series=1015103