Comparing changes
Open a pull request
base repository: qemu/qemu
base: afa351fe36d4
head repository: qemu/qemu
compare: b52daaf2c868
- 18 commits
- 11 files changed
- 4 contributors
Commits on Jun 5, 2023
-
util/iov: Make qiov_slice() public
We want to inline qemu_iovec_init_extended() in block/io.c for padding requests, and having access to qiov_slice() is useful for this. As a public function, it is renamed to qemu_iovec_slice(). (We will need to count the number of I/O vector elements of a slice there, and then later process this slice. Without qiov_slice(), we would need to call qemu_iovec_subvec_niov(), and all further IOV-processing functions may need to skip prefixing elements to accomodate for a qiov_offset. Because qemu_iovec_subvec_niov() internally calls qiov_slice(), we can just have the block/io.c code call qiov_slice() itself, thus get the number of elements, and also create an iovec array with the superfluous prefixing elements stripped, so the following processing functions no longer need to skip them.) Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230411173418.19549-2-hreitz@redhat.com>
-
block: Collapse padded I/O vecs exceeding IOV_MAX
When processing vectored guest requests that are not aligned to the storage request alignment, we pad them by adding head and/or tail buffers for a read-modify-write cycle. The guest can submit I/O vectors up to IOV_MAX (1024) in length, but with this padding, the vector can exceed that limit. As of 4c002ce ("util/iov: make qemu_iovec_init_extended() honest"), we refuse to pad vectors beyond the limit, instead returning an error to the guest. To the guest, this appears as a random I/O error. We should not return an I/O error to the guest when it issued a perfectly valid request. Before 4c002ce, we just made the vector longer than IOV_MAX, which generally seems to work (because the guest assumes a smaller alignment than we really have, file-posix's raw_co_prw() will generally see bdrv_qiov_is_aligned() return false, and so emulate the request, so that the IOV_MAX does not matter). However, that does not seem exactly great. I see two ways to fix this problem: 1. We split such long requests into two requests. 2. We join some elements of the vector into new buffers to make it shorter. I am wary of (1), because it seems like it may have unintended side effects. (2) on the other hand seems relatively simple to implement, with hopefully few side effects, so this patch does that. To do this, the use of qemu_iovec_init_extended() in bdrv_pad_request() is effectively replaced by the new function bdrv_create_padded_qiov(), which not only wraps the request IOV with padding head/tail, but also ensures that the resulting vector will not have more than IOV_MAX elements. Putting that functionality into qemu_iovec_init_extended() is infeasible because it requires allocating a bounce buffer; doing so would require many more parameters (buffer alignment, how to initialize the buffer, and out parameters like the buffer, its length, and the original elements), which is not reasonable. Conversely, it is not difficult to move qemu_iovec_init_extended()'s functionality into bdrv_create_padded_qiov() by using public qemu_iovec_* functions, so that is what this patch does. Because bdrv_pad_request() was the only "serious" user of qemu_iovec_init_extended(), the next patch will remove the latter function, so the functionality is not implemented twice. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2141964 Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230411173418.19549-3-hreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
-
util/iov: Remove qemu_iovec_init_extended()
bdrv_pad_request() was the main user of qemu_iovec_init_extended(). HEAD^ has removed that use, so we can remove qemu_iovec_init_extended() now. The only remaining user is qemu_iovec_init_slice(), which can easily inline the small part it really needs. Note that qemu_iovec_init_extended() offered a memcpy() optimization to initialize the new I/O vector. qemu_iovec_concat_iov(), which is used to replace its functionality, does not, but calls qemu_iovec_add() for every single element. If we decide this optimization was important, we will need to re-implement it in qemu_iovec_concat_iov(), which might also benefit its pre-existing users. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230411173418.19549-4-hreitz@redhat.com>
-
Test that even vectored IO requests with 1024 vector elements that are not aligned to the device's request alignment will succeed. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230411173418.19549-5-hreitz@redhat.com>
-
parallels: Out of image offset in BAT leads to image inflation
data_end field in BDRVParallelsState is set to the biggest offset present in BAT. If this offset is outside of the image, any further write will create the cluster at this offset and/or the image will be truncated to this offset on close. This is definitely not correct. Raise an error in parallels_open() if data_end points outside the image and it is not a check (let the check to repaire the image). Set data_end to the end of the cluster with the last correct offset. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Message-Id: <20230424093147.197643-2-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Fix high_off calculation in parallels_co_check()
Don't let high_off be more than the file size even if we don't fix the image. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-3-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Fix image_end_offset and data_end after out-of-image check
Set data_end to the end of the last cluster inside the image. In such a way we can be sure that corrupted offsets in the BAT can't affect on the image size. If there are no allocated clusters set image_end_offset by data_end. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-4-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: create parallels_set_bat_entry_helper() to assign BAT value
This helper will be reused in next patches during parallels_co_check rework to simplify its code. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-5-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Use generic infrastructure for BAT writing in parallels_co…
…_check() BAT is written in the context of conventional operations over the image inside bdrv_co_flush() when it calls parallels_co_flush_to_os() callback. Thus we should not modify BAT array directly, but call parallels_set_bat_entry() helper and bdrv_co_flush() further on. After that there is no need to manually write BAT and track its modification. This makes code more generic and allows to split parallels_set_bat_entry() for independent pieces. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-6-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Move check of unclean image to a separate function
We will add more and more checks so we need a better code structure in parallels_co_check. Let each check performs in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-7-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Move check of cluster outside image to a separate function
We will add more and more checks so we need a better code structure in parallels_co_check. Let each check performs in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-8-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Fix statistics calculation
Exclude out-of-image clusters from allocated and fragmented clusters calculation. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Message-Id: <20230424093147.197643-9-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Move check of leaks to a separate function
We will add more and more checks so we need a better code structure in parallels_co_check. Let each check performs in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Message-Id: <20230424093147.197643-10-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Move statistic collection to a separate function
We will add more and more checks so we need a better code structure in parallels_co_check. Let each check performs in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-11-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Replace qemu_co_mutex_lock by WITH_QEMU_LOCK_GUARD
Replace the way we use mutex in parallels_co_check() for simplier and less error prone code. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-12-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
parallels: Incorrect condition in out-of-image check
All the offsets in the BAT must be lower than the file size. Fix the check condition for correct check. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-13-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
qcow2: add discard-no-unref option
When we for example have a sparse qcow2 image and discard: unmap is enabled, there can be a lot of fragmentation in the image after some time. Especially on VM's that do a lot of writes/deletes. This causes the qcow2 image to grow even over 110% of its virtual size, because the free gaps in the image get too small to allocate new continuous clusters. So it allocates new space at the end of the image. Disabling discard is not an option, as discard is needed to keep the incremental backup size as low as possible. Without discard, the incremental backups would become large, as qemu thinks it's just dirty blocks but it doesn't know the blocks are unneeded. So we need to avoid fragmentation but also 'empty' the unneeded blocks in the image to have a small incremental backup. In addition, we also want to send the discards further down the stack, so the underlying blocks are still discarded. Therefor we introduce a new qcow2 option "discard-no-unref". When setting this option to true, discards will no longer have the qcow2 driver relinquish cluster allocations. Other than that, the request is handled as normal: All clusters in range are marked as zero, and, if pass-discard-request is true, it is passed further down the stack. The only difference is that the now-zero clusters are preallocated instead of being unallocated. This will avoid fragmentation on the qcow2 image. Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621 Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be> Message-Id: <20230605084523.34134-2-jean-louis@dupond.be> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
-
Merge tag 'pull-block-2023-06-05' of https://gitlab.com/hreitz/qemu i…
…nto staging Block patches - Fix padding of unaligned vectored requests to match the host alignment for vectors with 1023 or 1024 buffers - Refactor and fix bugs in parallels's image check functionality - Add an option to the qcow2 driver to retain (qcow2-level) allocations on discard requests from the guest (while still forwarding the discard to the lower level and marking the range as zero) # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEEy2LXoO44KeRfAE00ofpA0JgBnN8FAmR+AT4SHGhyZWl0ekBy # ZWRoYXQuY29tAAoJEKH6QNCYAZzfnboQAKD6YrreZLoseomRfqOAoApSf6yOdcHk # 6kfsvzwzjosomsF1Pkzm4851vX5PyDqTdeu0iViM+pxanVO1b494q1P4VcAERqMB # iZVs68R6M0l6HV9btWFGm+ibHJf4FapdntkIdwog1ka5TIhw5oDWCVNLigjhIoRv # sM37Bgf14kC3sFTR++0HESsyU1eUP5gJjwJbPZ2IgJBmzYay0is1z5nHA/3VUswu # 8dKnGQDsv62EtlK7PK8cU2BhLOeNi6Wr3bAb6Wf2QLB5e0qRb7oAkqNx5/UcTznk # a3XMC1aiWhYvM/+DaYIpQUcIPgA8xQ1KHKeD6WjbGfLgZBqseX0aGWMByUsiY8Bo # +BPIBnUDrbiPnAKB/XLQfnzlE+s7121/JpEbB7AkZqVFRGuw8Wur4tbc2fzvy8Pw # x/uQfv3ZPi/2Lf6u7hv/TVHubXi8jucVgx3Ubu5Jeo3901S4/KOQBQ4BQ/GYIGQX # 38ijSROcEd0eQJ1mTKPEctouxjSZCghNSbrn9DfsL1V3VWqWNKKGCU3hM+RQ1SJT # 688qvnyYt8QZfTsiDSHR/GfKsufG0DkoqE7c9IhSEPohecAH8Rrc3HcLut7fuwD2 # gCFQhm68CPwwRmBjPCY6Zi1RDzeOyFBSWN31T6t0yTb4OHJ/3/cSZVBJtwwkOVbx # zwabHDNdY5Kw # =GuoL # -----END PGP SIGNATURE----- # gpg: Signature made Mon 05 Jun 2023 08:37:34 AM PDT # gpg: using RSA key CB62D7A0EE3829E45F004D34A1FA40D098019CDF # gpg: issuer "hreitz@redhat.com" # gpg: Good signature from "Hanna Reitz <hreitz@redhat.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: CB62 D7A0 EE38 29E4 5F00 4D34 A1FA 40D0 9801 9CDF * tag 'pull-block-2023-06-05' of https://gitlab.com/hreitz/qemu: qcow2: add discard-no-unref option parallels: Incorrect condition in out-of-image check parallels: Replace qemu_co_mutex_lock by WITH_QEMU_LOCK_GUARD parallels: Move statistic collection to a separate function parallels: Move check of leaks to a separate function parallels: Fix statistics calculation parallels: Move check of cluster outside image to a separate function parallels: Move check of unclean image to a separate function parallels: Use generic infrastructure for BAT writing in parallels_co_check() parallels: create parallels_set_bat_entry_helper() to assign BAT value parallels: Fix image_end_offset and data_end after out-of-image check parallels: Fix high_off calculation in parallels_co_check() parallels: Out of image offset in BAT leads to image inflation iotests/iov-padding: New test util/iov: Remove qemu_iovec_init_extended() block: Collapse padded I/O vecs exceeding IOV_MAX util/iov: Make qiov_slice() public Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This comparison is taking too long to generate.
Unfortunately it looks like we can’t render this comparison for you right now. It might be too big, or there might be something weird with your repository.
You can try running this command locally to see the comparison on your machine:
git diff afa351fe36d4...b52daaf2c868