Live merge over-extends the active volume - fix requires downtime #188
Comments
When extending the base volume before merge, we use a dumb calculation, extending the base volume by top_size + chunk_size. This allocates far more space than is typically needed. For active layer merge, there is no way to reduce the volume after the merge without shutting down the VM. The result is that the active volume grows on every merge, until it consumes the maximum size.

Fix the issue by measuring the sub-chain from top to base before the extend. This gives the exact size needed to commit the top volume into the base volume, including the size required for the bitmaps that may be in the top and base volumes.

In the case of active layer merge, this measurement is a heuristic, since the guest can write data during the measurement, or later during the merge. We add one chunk of free space to minimize the chance of pausing a VM during merge.

The only way to prevent pausing during merge is to monitor the base volume block threshold during the merge. This was not possible in the past. It can be done with current libvirt, but vdsm thin provisioning code is not ready for this yet.

For internal merge, measuring is exact, and there is no need to leave free space in the base volume since the top volume is read only.

Because we extend volumes using the current size and capacity, always adding one chunk, the code is a little ugly, reporting the required size without free space. I think this can be improved by adding a different API to extend volumes to a known size.

Fixes: oVirt#188
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
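The sizing rule described in this commit message can be sketched roughly as follows. This is not the vdsm code: the function names, the 2.5 GiB chunk constant, and the assumption that the measurement arrives as JSON shaped like `qemu-img measure --output json` output (with `required` and an optional `bitmaps` field) are all illustrative.

```python
import json

GiB = 1024**3
# Assumed chunk size for illustration (2.5 GiB).
CHUNK_SIZE = 5 * GiB // 2


def required_base_size(measure_json, active, chunk_size=CHUNK_SIZE):
    """
    Return the size the base volume should have before the merge.

    measure_json is assumed to be the JSON result of measuring the
    sub-chain from top to base, with a "required" field and an
    optional "bitmaps" field (hypothetical input shape).
    """
    measure = json.loads(measure_json)
    required = measure["required"] + measure.get("bitmaps", 0)
    if active:
        # Active layer merge: the guest may keep writing, so leave
        # one chunk of free space to reduce the chance of pausing.
        required += chunk_size
    return required


def extension_needed(current_size, required):
    """Return how many bytes base must grow by, or 0 if it is big enough."""
    return max(0, required - current_size)
```

With this rule, a base volume that already has enough room for the measured size is not extended at all, which is the common case for a small top volume.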
@nirs note that we already have a bug related to that: https://bugzilla.redhat.com/1993235
Snapshot growing to more than the virtual size is expected if the snapshot is full. But |
When extending the base volume before merge, we use a dumb calculation, extending the base volume by top_size + chunk_size. This allocates far more space than is typically needed. For active layer merge, there is no way to reduce the volume after the merge without shutting down the VM. The result is that the active volume grows on every merge, until it consumes the maximum size.

Fix the issue by measuring the sub-chain from top to base before the extend. This gives the exact size needed to commit the top volume into the base volume, including the size required for the bitmaps that may be in the top and base volumes.

In the case of active layer merge, this measurement is a heuristic, since the guest can write data during the measurement, or later during the merge. We add one chunk of free space to minimize the chance of pausing a VM during merge.

The only way to prevent pausing during merge is to monitor the base volume block threshold during the merge. This was not possible in the past. It can be done with current libvirt, but vdsm thin provisioning code is not ready for this yet.

For internal merge, measuring is exact, and there is no need to leave free space in the base volume since the top volume is read only.

Fixes: oVirt#188
Related-to: https://bugzilla.redhat.com/1993235
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Downstream bug https://bugzilla.redhat.com/1993235 was closed as not a bug, but |
When extending the base volume before live merge, we use
a dumb calculation, extending base by top_size + chunk_size.
This calculation is correct only in the most extreme and practically
impossible case, when all clusters in top do not exist in base, and base
and top have no free space.
In practice, in a typical live merge there is only a small amount of data
in top, some of the clusters in top are already in base, and base has
a lot of free space, so base does not need to be extended, or needs only
a small extension.
After the live merge is completed, the base becomes the active image,
so we cannot reduce it to the optimal size. Because top size is at least
one chunk, this extends base by at least 2 chunks for every live merge.
Base size after live merge:
The sad result is that the active volume grows on each live merge, until
it reaches the virtual size of the disk.
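The growth described above can be illustrated with a small simulation. This is only a sketch of the arithmetic, assuming the naive rule adds top_size + chunk_size on every merge and that the result is capped at some maximum volume size; the function names and the `max_size` parameter are hypothetical.

```python
GiB = 1024**3
# Chunk size of 2.5 GiB, as mentioned in this issue.
CHUNK_SIZE = 5 * GiB // 2


def naive_size_after_merge(base_size, top_size, max_size, chunk_size=CHUNK_SIZE):
    """Base size after one live merge using the naive extend rule."""
    return min(base_size + top_size + chunk_size, max_size)


def merges_until_full(base_size, max_size, top_size=CHUNK_SIZE):
    """
    Count live merges until base reaches max_size, assuming every
    top volume has the minimal size of one chunk.
    """
    count = 0
    while base_size < max_size:
        base_size = naive_size_after_merge(base_size, top_size, max_size)
        count += 1
    return count
```

For example, with illustrative numbers, a base volume of one chunk and a cap of 33 GiB reaches the cap after 7 merges, even if the guest wrote almost nothing between them.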
To reduce the active volume to the optimal size, the user needs to shut down
the VM and invoke the reduce disk action in the API, or create a snapshot
and perform a cold merge, since after a cold merge we reduce the volume to
the optimal size.
For an internal volume, the user can invoke the reduce volume API, but this
is not easy to do and the user does not have any indication that there is
a problem.
This is not a new issue - the problem has existed since oVirt 3.5, which
added live snapshot support. But now this issue is much more important,
because we use active layer merge during VM backup, and we increased the
chunk size to 2.5 GiB.
How to reproduce:
Actual result:
Active volume size grows to maximum size (~33 GiB)
Expected result:
Active volume size does not change (~2.62 GiB)