Converting qcow2 images to raw is too slow #2579

Closed
nirs opened this issue Sep 1, 2024 · 11 comments · Fixed by #2798

Comments

@nirs
Contributor

nirs commented Sep 1, 2024

Description

Based on the logs, converting the Ubuntu server cloud image (xxx MiB) to raw format takes 17 seconds. The same operation using qemu-img convert takes 1.8 seconds.

Example log:

time="2024-09-02T01:43:25+03:00" level=info msg="Converting \"/Users/nsoffer/.lima/cluster/basedisk\" (qcow2) to a raw disk \"/Users/nsoffer/.lima/cluster/diffdisk\""
...
time="2024-09-02T01:43:42+03:00" level=info msg="Expanding to 20GiB"

Same with qemu-img

% time qemu-img convert -f qcow2 -O raw ~/.lima/cluster/basedisk diffdisk
qemu-img convert -f qcow2 -O raw ~/.lima/cluster/basedisk diffdisk  2.37s user 1.90s system 241% cpu 1.768 total

Lima shows a nice progress bar during the slow conversion, but qemu-img is fast enough that no progress bar is needed. It also has a progress option that can be used to report progress if needed.

Fix:

  • use qemu-img convert if available
  • use -p to show progress
@jandubois
Member

Fix:

  • use qemu-img convert if available

It would be better to fix the speed of the builtin conversion so it will be fast even when QEMU is not installed. Given that the default emulation in Lima 1.0 will be VZ, qemu will be an optional dependency.

@nirs
Contributor Author

nirs commented Sep 2, 2024

I don't think that reinventing qemu-img is a good direction. The time spent on it could be spent on features that add value for users. qemu-img is efficient, supports all image formats, is well maintained, and is available everywhere.

@afbjorklund
Member

You can default to qemu-img (where available), and then fall back to the library as a slower option?

We have used this trick elsewhere, like with SFTP or with XZ. The downside is having two code paths to test...

@AkihiroSuda
Member

qemu-img is efficient, supports all image formats, is well maintained, and is available everywhere.

On macOS, it is hard to install qemu-img when Homebrew/MacPorts/nix is disallowed due to employers' policy

@AkihiroSuda
Member

@jandubois
Member

jandubois commented Oct 9, 2024

I just found out that the built-in conversion needs more disk space than qemu-img convert. While the end result is still a sparse disk, it seems to require the full 100GB of disk space temporarily, so you cannot convert from QCOW2 to RAW on a device with limited free space.

$ df -h ~/.lima3
Filesystem    Size    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/disk5    50Gi   692Mi    49Gi     2%      11  4.3G    0%   /Users/jan/.lima3

$ l start --vm-type vz
? Creating an instance "default" Proceed with the current configuration
INFO[0001] Starting the instance "default" with VM driver "vz"

INFO[0002] Converting "/Users/jan/.lima3/default/basedisk" (qcow2) to a raw disk "/Users/jan/.lima3/default/diffdisk"
3.50 GiB / 3.50 GiB [-------------------------------------] 100.00% 206.87 MiB/s
INFO[0019] Expanding to 100GiB
FATA[0020] failed to convert "/Users/jan/.lima3/default/basedisk" to a raw disk "/Users/jan/.lima3/default/diffdisk": no space left on device

Using qemu-img convert seems to require little extra space beyond what the new sparse file actually occupies.

@jandubois
Member

While the end-result is still a sparse disk

Actually, it is not, with the builtin conversion. It turns into a fully allocated disk. So this is even worse. That also might explain why it takes so long: it possibly writes the full 100GB to disk.

@AkihiroSuda
Member

The non-sparse issue is being fixed in:

@nirs
Contributor Author

nirs commented Oct 13, 2024

I think the simplest way to fix it is to convert the image to raw after the download. There is no reason to keep qcow2 files in the cache when we use the file as a base disk, even when using qemu.

We can try to optimize qcow2 convert later to make the initial download faster.

New flow:

  1. download the image in whatever format (raw, qcow2, raw compressed)
  2. verify the checksum
  3. convert to uncompressed raw file

When creating a vm we can always do fast copy on the raw image from the cache.

Questions:

  • do we use the stored checksum of the qcow2 image after the download?
  • do we need a checksum of the raw file?

Issues:

  • will not help when a user creates qcow2 disks and tries to attach them to a vz-based instance

Testing shows that this makes limactl start almost 3 times faster:

Starting from qcow2 image

% cat test-qcow2.yaml 
images:
- location: "https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-arm64.img"
  arch: "aarch64"
vmType: vz
plain: true

% time limactl start --tty=false test-qcow2.yaml
INFO[0000] Terminal is not available, proceeding without opening an editor 
INFO[0000] Starting the instance "test-qcow2" with VM driver "vz" 
INFO[0000] Attempting to download the image              arch=aarch64 digest= location="https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-arm64.img"
INFO[0000] Using cache "/Users/nsoffer/Library/Caches/lima/download/by-url-sha256/002fbe468673695a2206b26723b1a077a71629001a5b94efd8ea1580e1c3dd06/data" 
INFO[0000] Converting "/Users/nsoffer/.lima/test-qcow2/basedisk" (qcow2) to a raw disk "/Users/nsoffer/.lima/test-qcow2/diffdisk" 
3.50 GiB / 3.50 GiB [-------------------------------------] 100.00% 201.56 MiB/s
INFO[0018] Expanding to 100GiB                          
WARN[0018] [hostagent] GRPC port forwarding is experimental 
INFO[0018] [hostagent] hostagent socket created at /Users/nsoffer/.lima/test-qcow2/ha.sock 
INFO[0018] [hostagent] Starting VZ (hint: to watch the boot progress, see "/Users/nsoffer/.lima/test-qcow2/serial*.log") 
INFO[0018] [hostagent] new connection from  to          
INFO[0019] SSH Local Port: 59529                        
INFO[0018] [hostagent] [VZ] - vm state change: running  
INFO[0018] [hostagent] Running in plain mode. Mounts, port forwarding, containerd, etc. will be ignored. Guest agent will not be running. 
INFO[0018] [hostagent] Waiting for the essential requirement 1 of 1: "ssh" 
INFO[0028] [hostagent] Waiting for the essential requirement 1 of 1: "ssh" 
INFO[0028] [hostagent] The essential requirement 1 of 1 is satisfied 
INFO[0028] [hostagent] Waiting for the final requirement 1 of 1: "boot scripts must have finished" 
INFO[0028] [hostagent] The final requirement 1 of 1 is satisfied 
INFO[0029] READY. Run `ssh -F "/Users/nsoffer/.lima/test-qcow2/ssh.config" lima-test-qcow2` to open the shell.
limactl start --tty=false test-qcow2.yaml  19.99s user 1.53s system 71% cpu 29.911 total

Starting from raw image

% cat test-raw.yaml  
images:
- location: "/Users/nsoffer/vms/ubuntu-24.04-server-cloudimg-arm64.img"
  arch: "aarch64"
vmType: vz
plain: true

% time limactl start --tty=false test-raw.yaml  
INFO[0000] Terminal is not available, proceeding without opening an editor 
INFO[0000] Starting the instance "test-raw" with VM driver "vz" 
INFO[0000] Attempting to download the image              arch=aarch64 digest= location=/Users/nsoffer/vms/ubuntu-24.04-server-cloudimg-arm64.img
INFO[0000] Downloaded the image from "/Users/nsoffer/vms/ubuntu-24.04-server-cloudimg-arm64.img" 
INFO[0000] Converting "/Users/nsoffer/.lima/test-raw/basedisk" (raw) to a raw disk "/Users/nsoffer/.lima/test-raw/diffdisk" 
INFO[0000] Expanding to 100GiB                          
WARN[0000] [hostagent] GRPC port forwarding is experimental 
INFO[0000] [hostagent] hostagent socket created at /Users/nsoffer/.lima/test-raw/ha.sock 
INFO[0000] [hostagent] Starting VZ (hint: to watch the boot progress, see "/Users/nsoffer/.lima/test-raw/serial*.log") 
INFO[0000] [hostagent] new connection from  to          
INFO[0000] SSH Local Port: 59539                        
INFO[0000] [hostagent] [VZ] - vm state change: running  
INFO[0000] [hostagent] Running in plain mode. Mounts, port forwarding, containerd, etc. will be ignored. Guest agent will not be running. 
INFO[0000] [hostagent] Waiting for the essential requirement 1 of 1: "ssh" 
INFO[0010] [hostagent] Waiting for the essential requirement 1 of 1: "ssh" 
INFO[0010] [hostagent] The essential requirement 1 of 1 is satisfied 
INFO[0010] [hostagent] Waiting for the final requirement 1 of 1: "boot scripts must have finished" 
INFO[0010] [hostagent] The final requirement 1 of 1 is satisfied 
INFO[0011] READY. Run `ssh -F "/Users/nsoffer/.lima/test-raw/ssh.config" lima-test-raw` to open the shell. 
limactl start --tty=false test-raw.yaml  0.03s user 0.08s system 0% cpu 11.371 total

@nirs
Contributor Author

nirs commented Oct 15, 2024

Converting the compressed qcow2 is 1.6 times faster with lima-vm/go-qcow2reader#31 but matching qemu-img requires much more work.

@nirs
Contributor Author

nirs commented Oct 25, 2024

Converting once at the end of the download is better, but with the improved go-qcow2reader this saves only 2 seconds for the default image, so it is lower priority. I'll open a new issue for this to consider in a future version.
