Commits
SelvaKumar-S/b…
Commits on Aug 17, 2021
-
dm kcopyd: add simple copy offload support
Introduce copy_jobs to use copy offload if it is supported by the underlying devices, otherwise fall back to the existing method. run_copy_jobs() calls the block layer copy offload API if both the source and destination request queues are the same and support copy offload. On successful completion, the copied count of the destination regions is set to zero; failed regions are processed via the existing method.
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
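A minimal sketch of the eligibility check this commit describes, deciding between the offload path and kcopyd's regular read/write jobs. blk_queue_copy() is an assumption standing in for whatever queue-flag test the series adds; it is not a mainline helper.

  #include <linux/blkdev.h>

  /* Hypothetical helper: offload only when source and destination share a
   * request queue that advertises copy support; otherwise the caller falls
   * back to the existing kcopyd read/write method. */
  static bool kcopyd_can_offload_copy(struct block_device *src,
                                      struct block_device *dst)
  {
          struct request_queue *sq = bdev_get_queue(src);
          struct request_queue *dq = bdev_get_queue(dst);

          return sq == dq && blk_queue_copy(sq); /* blk_queue_copy(): assumed */
  }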
-
nvme: add simple copy support
Add support for TP 4065a ("Simple Copy Command"), v2020.05.04 ("Ratified"). For devices supporting native simple copy, this implementation accepts the payload passed from the block layer, converts it into a simple copy command, and submits it to the device. The device's copy limits are set as queue limits. By default copy_offload is disabled. End-to-end protection is handled by setting both PRINFOR and PRINFOW to 0.
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Javier González <javier.gonz@samsung.com>
-
block: add emulation for simple copy
For devices that do not support simple copy, copy emulation is added. Copy is also performed via emulation for stacked devices. Copy emulation is implemented by allocating the largest possible buffer, up to the total copy size. The source ranges are read into memory by chaining a bio for each source range, submitting them asynchronously, and waiting for completion on the last bio. After the data is read, it is written to the destination, and the process is repeated until no source ranges are left. bio_map_kern() is used to allocate a bio and add the pages of the copy buffer to it. As bio->bi_private and bio->bi_end_io are needed for chaining the bios and get overwritten, invalidate_kernel_vmap_range() for the read is called in the caller.
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
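Roughly, the loop described above could look like the following sketch. It is not the patch itself: error paths are trimmed, the buffer-size cap and the per-pass iteration of the real emulation are omitted (everything is done in one pass here), and struct range_entry is assumed to be the {src, len} layout used by the BLKCOPY example later on this page.

  #include <linux/bio.h>
  #include <linux/blkdev.h>
  #include <linux/highmem.h>
  #include <linux/vmalloc.h>

  /* Assumed layout from this series: one source range, in sectors. */
  struct range_entry {
          __u64 src;
          __u64 len;
  };

  static int copy_emulate_sketch(struct block_device *bdev,
                                 struct range_entry *ranges, int nr,
                                 sector_t dest, gfp_t gfp)
  {
          struct request_queue *q = bdev_get_queue(bdev);
          struct bio *bio = NULL, *read_bio;
          sector_t total = 0, off = 0;
          void *buf;
          int i, ret;

          if (nr <= 0)
                  return -EINVAL;

          for (i = 0; i < nr; i++)
                  total += ranges[i].len;

          buf = vmalloc(total << SECTOR_SHIFT);
          if (!buf)
                  return -ENOMEM;

          /* Read every source range into the buffer, chaining the bios and
           * waiting only on the last one. */
          for (i = 0; i < nr; i++) {
                  read_bio = bio_map_kern(q, buf + (off << SECTOR_SHIFT),
                                          ranges[i].len << SECTOR_SHIFT, gfp);
                  if (IS_ERR(read_bio)) {
                          ret = PTR_ERR(read_bio);
                          goto out;
                  }
                  read_bio->bi_iter.bi_sector = ranges[i].src;
                  read_bio->bi_opf = REQ_OP_READ;
                  bio_set_dev(read_bio, bdev);
                  if (bio) {
                          bio_chain(bio, read_bio);
                          submit_bio(bio);
                  }
                  bio = read_bio;
                  off += ranges[i].len;
          }
          ret = submit_bio_wait(bio);
          bio_put(bio);
          /* Chaining overwrote bio_map_kern()'s end_io, so invalidate here,
           * as the commit message explains. */
          invalidate_kernel_vmap_range(buf, total << SECTOR_SHIFT);
          if (ret)
                  goto out;

          /* Write the buffered data to the destination in one go. */
          bio = bio_map_kern(q, buf, total << SECTOR_SHIFT, gfp);
          if (IS_ERR(bio)) {
                  ret = PTR_ERR(bio);
                  goto out;
          }
          bio->bi_iter.bi_sector = dest;
          bio->bi_opf = REQ_OP_WRITE;
          bio_set_dev(bio, bdev);
          ret = submit_bio_wait(bio);
          bio_put(bio);
  out:
          vfree(buf);
          return ret;
  }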
-
block: Introduce a new ioctl for simple copy
Add a new BLKCOPY ioctl that offloads copying of one or more source ranges to a destination in the device. The ioctl accepts a 'copy_range' structure that contains the destination (in sectors), the number of source ranges, and a pointer to the array of source ranges. Each source range is represented by a 'range_entry' that contains the start and length of the range (in sectors). MAX_COPY_NR_RANGE limits the number of entries for the ioctl and MAX_COPY_TOTAL_LENGTH limits the total copy length the ioctl can handle.
Example code to issue BLKCOPY:

  /* Sample example to copy three source ranges [0, 8] [16, 8] [32, 8] to
   * [64, 24], on the same device */
  int main(void)
  {
          int ret, fd;
          struct range_entry source_range[] = {{.src = 0, .len = 8},
                                               {.src = 16, .len = 8},
                                               {.src = 32, .len = 8},};
          struct copy_range cr;

          cr.dest = 64;
          cr.nr_range = 3;
          cr.range_list = (__u64)&source_range;

          fd = open("/dev/nvme0n1", O_RDWR);
          if (fd < 0)
                  return 1;

          ret = ioctl(fd, BLKCOPY, &cr);
          if (ret < 0)
                  printf("copy failure\n");

          close(fd);
          return ret;
  }

Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
-
block: copy offload support infrastructure
Introduce REQ_OP_COPY, a no-merge copy offload operation. Create a bio with the control information as payload and submit it to the device. A larger copy operation may be split if necessary based on the device limits. REQ_OP_COPY (19) is a write op and takes the zone write lock when submitted to a zoned device. Native copy offload is not supported for stacked devices.
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
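A hypothetical sketch of how a caller might build and submit such a copy bio. REQ_OP_COPY comes from this series rather than mainline, the payload is treated as an opaque, kmalloc()ed control block that fits in one page, and the destination sector is passed separately; the submission details are inferred from the commit message, not taken from the patch.

  #include <linux/bio.h>
  #include <linux/blkdev.h>
  #include <linux/mm.h>

  /* @payload: control information (source ranges etc.), kmalloc()ed and
   * small enough to fit in a single page for this illustration. */
  static int submit_copy_bio_sketch(struct block_device *bdev, sector_t dest,
                                    void *payload, unsigned int payload_len,
                                    gfp_t gfp)
  {
          struct bio *bio;
          int ret;

          bio = bio_alloc(gfp, 1);
          if (!bio)
                  return -ENOMEM;

          bio_set_dev(bio, bdev);
          bio->bi_opf = REQ_OP_COPY | REQ_NOMERGE;    /* no-merge write op */
          bio->bi_iter.bi_sector = dest;
          bio_add_page(bio, virt_to_page(payload), payload_len,
                       offset_in_page(payload));

          ret = submit_bio_wait(bio);
          bio_put(bio);
          return ret;
  }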
-
block: Introduce queue limits for copy-offload support
Add device limits as sysfs entries:
  - copy_offload (READ_WRITE)
  - max_copy_sectors (READ_ONLY)
  - max_copy_ranges_sectors (READ_ONLY)
  - max_copy_nr_ranges (READ_ONLY)
copy_offload (= 0) is disabled by default; it needs to be enabled if copy offload is to be used. max_copy_sectors = 0 indicates that the device doesn't support native copy.
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
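A small userspace sketch of how these entries might be consulted and toggled. The /sys/block/<disk>/queue/ location is an assumption based on where request_queue limits are normally exposed, and nvme0n1 is just an example disk.

  #include <stdio.h>

  int main(void)
  {
          const char *base = "/sys/block/nvme0n1/queue";
          char path[256], buf[64];
          FILE *f;

          /* copy_offload is 0 by default; write 1 to opt in. */
          snprintf(path, sizeof(path), "%s/copy_offload", base);
          f = fopen(path, "w");
          if (f) {
                  fputs("1\n", f);
                  fclose(f);
          }

          /* max_copy_sectors == 0 means no native copy support. */
          snprintf(path, sizeof(path), "%s/max_copy_sectors", base);
          f = fopen(path, "r");
          if (f) {
                  if (fgets(buf, sizeof(buf), f))
                          printf("max_copy_sectors: %s", buf);
                  fclose(f);
          }
          return 0;
  }
-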
block: make bio_map_kern() non static
Make bio_map_kern() non-static so that copy offload/emulation can use it to add vmalloc'ed memory to a bio.
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
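For reference, the signature being exposed looks like the following; which header it ends up declared in (e.g. include/linux/blkdev.h) is an assumption.

  struct bio *bio_map_kern(struct request_queue *q, void *data,
                           unsigned int len, gfp_t gfp_mask);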
Commits on Aug 16, 2021
-
Merge branch 'for-5.15/io_uring' into for-next
* for-5.15/io_uring:
  io_uring: optimise io_prep_linked_timeout()
  io_uring: cancel not-armed linked touts separately
  io_uring: simplify io_prep_linked_timeout
  io_uring: kill REQ_F_LTIMEOUT_ACTIVE
  io_uring: deduplicate cancellation code
  io_uring: kill not necessary resubmit switch
  io_uring: optimise initial ltimeout refcounting
  io_uring: don't inflight-track linked timeouts
  io_uring: optimise iowq refcounting
-
Merge branch 'for-5.15/block' into for-next
* for-5.15/block:
  block: unexport blk_register_queue
  blk-cgroup: stop using seq_get_buf
  blk-cgroup: refactor blkcg_print_stat
  nvme: use bvec_virt
  dcssblk: use bvec_virt
  dasd: use bvec_virt
  ps3vram: use bvec_virt
  ubd: use bvec_virt
  sd: use bvec_virt
  bcache: use bvec_virt
  virtio_blk: use bvec_virt
  rbd: use bvec_virt
  squashfs: use bvec_virt
  dm-integrity: use bvec_virt
  dm-ebs: use bvec_virt
  dm: make EBS depend on !HIGHMEM
  block: use bvec_virt in bio_integrity_{process,free}
  bvec: add a bvec_virt helper
  block: ensure the bdi is freed after inode_detach_wb
  block: free the extended dev_t minor later
-
Merge branch 'for-5.15/libata' into for-next
* for-5.15/libata:
  ata: sata_dwc_460ex: No need to call phy_exit() before phy_init()
-
io_uring: optimise io_prep_linked_timeout()
Linked timeout handling during issuing is heavy: it adds extra instructions and forces us to save the next linked timeout before io_issue_sqe(). Following the same reasoning as in the refcounting patches, a request can't be freed by the time it returns from io_issue_sqe(), so now we don't need to do io_prep_linked_timeout() in advance, and it can be delayed to colder paths, optimising the generic path. It should also save quite a lot for requests with linked timeouts that are completed inline: timeout spinlocking + hrtimer_start() + hrtimer_try_to_cancel() and so on.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/19bfc9a0d26c5c5f1e359f7650afe807ca8ef879.1628981736.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
block: unexport blk_register_queue
Not actually used in any modular code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816123649.601591-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
blk-cgroup: stop using seq_get_buf
seq_get_buf is a crutch that undoes all the memory safety of the seq_file interface. Use the normal seq_printf interfaces instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20210810152623.1796144-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
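A generic illustration of the conversion pattern this commit describes (not the actual blk-cgroup code): print through the bounds-checked seq_file helpers instead of formatting into a raw buffer obtained with seq_get_buf().

  #include <linux/seq_file.h>

  static int example_show(struct seq_file *sf, void *v)
  {
          /* Old pattern being removed:
           *      size_t size, off = 0;
           *      char *buf = seq_get_buf(sf, &size);
           *      off += scnprintf(buf + off, size - off, "stat %d\n", 42);
           *      seq_commit(sf, off);
           */
          seq_printf(sf, "stat %d\n", 42);        /* new pattern */
          return 0;
  }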
-
blk-cgroup: refactor blkcg_print_stat
Factor out a helper to deal with a single blkcg_gq to make the code a little bit easier to follow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20210810152623.1796144-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
nvme: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20210804095634.460779-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
dcssblk: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
dasd: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Link: https://lore.kernel.org/r/20210804095634.460779-14-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
ps3vram: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-13-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
ubd: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20210804095634.460779-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
sd: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210804095634.460779-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
bcache: use bvec_virt
Use bvec_virt instead of open coding it. Note that the existing code is fine despite ignoring bv_offset, as the bio is known to contain exactly one page from the page allocator per bio_vec.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20210804095634.460779-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
virtio_blk: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20210804095634.460779-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
rbd: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20210804095634.460779-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
squashfs: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
dm-integrity: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
dm-ebs: use bvec_virt
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
dm: make EBS depend on !HIGHMEM
__ebs_rw_bvec() uses page_address() on the submitted bio's data, and thus can't deal with highmem. Disable the target on highmem configs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
block: use bvec_virt in bio_integrity_{process,free}
Use the bvec_virt helper to clean up the bio integrity processing a little bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@kernel.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210804095634.460779-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
bvec: add a bvec_virt helper
Add a helper to get the virtual address for a bvec. This avoids requiring all callers to know about the page + offset representation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210804095634.460779-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
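The helper amounts to the open-coded pattern the follow-up patches in this merge remove; a sketch of its definition and of the conversion it enables (placement in include/linux/bvec.h is assumed from context):

  static inline void *bvec_virt(struct bio_vec *bvec)
  {
          /* Only valid when the bvec's page is not highmem. */
          return page_address(bvec->bv_page) + bvec->bv_offset;
  }

  /* Conversion pattern used by the per-driver patches:
   *   before: addr = page_address(bv.bv_page) + bv.bv_offset;
   *   after:  addr = bvec_virt(&bv);
   */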
-
block: ensure the bdi is freed after inode_detach_wb
inode_detach_wb references the "main" bdi of the inode. With the recent change to move the bdi from the request_queue to the gendisk this causes a guaranteed use after free when using certain cgroup configurations. The bug itself is older, though, as any non-default inode reference (e.g. an open file descriptor) could have triggered this use after free even before that.
Fixes: 52ebea7 ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks")
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Reported-by: syzbot <syzbot+1fb38bb7d3ce0fa3e1c4@syzkaller.appspotmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816122614.601358-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
block: free the extended dev_t minor later
The dev_t is used as the inode hash, so we should only release it once the block device inode is gone from the inode cache. Move it to bdev_free_inode to ensure that.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816122614.601358-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commits on Aug 15, 2021
-
io_uring: cancel not-armed linked touts separately
Adjust io_disarm_next(), so it can detect if there is a linked but not-yet-armed timeout and complete/cancel it separately. Will be used in the following patch.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ae228cde2c0df3d92d29d5e4852ed9fa8a2a97db.1628981736.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
io_uring: simplify io_prep_linked_timeout
The link test in io_prep_linked_timeout() is pretty bulky, so replace it with a flag. It's better for the normal path and linked requests, and will also be used later for request failing.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3703770bfae8bc1ff370e43ef5767940202cab42.1628981736.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
io_uring: kill REQ_F_LTIMEOUT_ACTIVE
Instead of handling double consecutive linked timeouts through tricky flag combinations, just check the submit_state.link during timeout_prep and fail that case in advance.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/04150760b0dc739522264b8abd309409f7421a06.1628981736.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
io_uring: deduplicate cancellation code
IORING_OP_ASYNC_CANCEL and IORING_OP_LINK_TIMEOUT have enough overlap, so extract a helper for request cancellation and use it in both. This also removes some ugliness caused by success_ret.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/900122b588e65b637e71bfec80a260726c6a54d6.1628981736.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>