ova-compose: fix slowness of hashsum generation #59

bo-gan-broadcom · 2024-04-28T23:48:03Z

Previously, manifest hashsum generation read the entire image file into memory and fed the buffer to hashlib. This could make the Python process use an excessive amount of memory and slow down hashsum generation, especially for large images (several GB in size). This change fixes that by reading the image file block by block. The default block size is 1 MB.
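For readers who want to see the idea in isolation, here is a minimal sketch of block-by-block hashing with hashlib. It is not the code merged in this PR: the function name `file_hexdigest` and its parameters are hypothetical, and only the 1 MB default block size is taken from the description above.

```python
import hashlib


def file_hexdigest(path, algorithm="sha256", block_size=1024 * 1024):
    """Hash a file in fixed-size blocks (hypothetical helper, 1 MB default)."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        while True:
            # Read at most block_size bytes; an empty result means end of file.
            block = f.read(block_size)
            if not block:
                break
            # Feed each block to the hash object; the digest is the same as
            # hashing the whole file in one call.
            h.update(block)
    return h.hexdigest()
```

Because only one block is held at a time, peak memory use should stay close to `block_size` regardless of the image size.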

Signed-off-by: Bo Gan <bo.gan@broadcom.com>

vmwclabot · 2024-04-28T23:48:10Z

@bo-gan-broadcom, you must sign our contributor license agreement before your changes are merged. Click here to sign the agreement. If you are a VMware employee, read this for further instruction.

oliverkurth · 2024-04-29T00:23:05Z

Thank you. Do you have a comparison of the time used with and without that change?

bo-gan-broadcom · 2024-04-29T02:09:03Z

It depends on the memory size and image size. If the entire image file fits into memory, there's not much difference in runtime, although the blocked read is still a little faster:

E.g., for a photon yaml with a ~6 GB ISO attached (and no vmdk file), I get the following runtimes:

Command line:
$ time ova-compose.py -i photon-iso-install-hw14.yaml -o photon-iso-install-hw14.ovf -f ovf -m --checksum-type sha256
Before:

creating 'photon-iso-install-hw14.ovf' with format 'ovf' from 'photon-iso-install-hw14.yaml'
done.
18.93user 5.54system 0:25.20elapsed 97%CPU (0avgtext+0avgdata **6051556maxresident**)k
11550728inputs+40outputs (41major+**1510975minor**)pagefaults 0swaps

After:

creating 'photon-iso-install-hw14.ovf' with format 'ovf' from 'photon-iso-install-hw14.yaml'
done.
19.41user 2.65system 0:24.14elapsed 91%CPU (0avgtext+0avgdata **24116maxresident**)k
11218296inputs+40outputs (48major+**3490minor**)pagefaults 0swaps

However, if the image is too large to fit into physical memory, heavy swapping sets in and performance drops considerably. The command simply won't work if you don't have enough swap space configured. Reading the entire file into memory is really unnecessary.
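The memory difference can also be reproduced on a small scale without a multi-GB image. The sketch below is an illustration under stated assumptions, not part of ova-compose: all function names are hypothetical, the test file is random data in a temporary file, and `tracemalloc` only counts Python-level allocations, so its numbers are not directly comparable to the `maxresident` figures reported by `time` above.

```python
import hashlib
import os
import tempfile
import tracemalloc

BLOCK_SIZE = 1024 * 1024  # 1 MB, matching the default mentioned in this PR


def hash_whole(path):
    # Old approach: read the entire file into one buffer, then hash it.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def hash_blocked(path, block_size=BLOCK_SIZE):
    # New approach: feed the hash one block at a time.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def peak_bytes(func, path):
    # Return (digest, peak traced Python allocation in bytes) for one call.
    tracemalloc.start()
    try:
        digest = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return digest, peak


def compare(size=16 * 1024 * 1024):
    # Write `size` random bytes to a temporary file and measure both approaches.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size))
        path = tmp.name
    try:
        return peak_bytes(hash_whole, path), peak_bytes(hash_blocked, path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    (d_whole, p_whole), (d_blocked, p_blocked) = compare()
    print("digests match:", d_whole == d_blocked)
    print("peak bytes, whole-file read:", p_whole)
    print("peak bytes, blocked read:   ", p_blocked)
```

The whole-file variant should show a peak of at least the file size, while the blocked variant should stay within a few multiples of `BLOCK_SIZE`.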

oliverkurth · 2024-04-29T02:48:38Z

Thanks, the change makes sense. I just haven't noticed any performance impact, probably because I never tested it with a disk size greater than working memory, especially since a vmdk is compressed and I haven't tested with large ISO images.

oliverkurth merged commit c4e8c5d into vmware:master Apr 29, 2024
1 of 2 checks passed
