
Some more performance testing for bzip2 and gzip #157

Open
MaxPower85 opened this issue Mar 21, 2018 · 4 comments


@MaxPower85

I did some testing… comparing 7z to lbzip2 and pbzip2…

Both pbzip2 and lbzip2 seem super fast, much faster than 7z regardless of the compression level you choose for 7z… even 7z compressing to bzip2 on level 1 (which should be its fastest) took about 25 seconds and produced a 345.2MB file (the original file was about 730MB).

lbzip2 compressed the same file to 341.1MB in about 10 seconds…
lbzip2 -z -9 -n 10

pbzip2 compressed the same file to 341MB (slightly smaller) in a similar time
pbzip2 -z -9

Using lower levels with pbzip2 & lbzip2 didn’t seem to make much sense in my opinion, because even on level 1 pbzip2 and lbzip2 were about as fast as on level 9, but the file was bigger (345.5MB for pbzip2 and 346.2MB for lbzip2)… so just use level 9 if it’s super-optimized for that level. I'm not sure whether pbzip2 & lbzip2 would be noticeably faster on lower levels on some much older Mac, but they seemed pretty fast even in Parallels, with the limited number of CPU cores they can use there.
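For anyone who wants to repeat the level 1 vs level 9 comparison, here is a minimal timing sketch. It uses Python's built-in single-threaded `bz2` module as a stand-in, not pbzip2 or lbzip2, so the absolute numbers will not match the ones above; `measure` and `make_sample` are hypothetical helper names and the sample data is made up.

```python
import bz2
import os
import time


def measure(data: bytes, level: int) -> tuple[float, int]:
    """Compress data with bz2 at the given level; return (seconds, compressed size)."""
    start = time.perf_counter()
    compressed = bz2.compress(data, compresslevel=level)
    elapsed = time.perf_counter() - start
    return elapsed, len(compressed)


def make_sample(size: int = 2_000_000) -> bytes:
    """Hypothetical test input: roughly half repetitive text, half random bytes."""
    line = b"the quick brown fox jumps over the lazy dog\n"
    text = line * (size // (2 * len(line)))
    return text + os.urandom(size - len(text))


if __name__ == "__main__":
    sample = make_sample()
    for level in (1, 9):
        seconds, size = measure(sample, level)
        print(f"level {level}: {size} bytes in {seconds:.3f} s")
```

To time the real tools instead, the same loop could wrap a `subprocess.run` call around the `pbzip2 -z -9` or `lbzip2 -z -9 -n 10` commands quoted above.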

On level 4, 7z compressed the file to 340.6MB in bzip2 format (so, slightly smaller than what pbzip2 produced on level 9 in about 10 seconds), but it needed about 25 seconds for that… so I don’t think it makes much sense to use 7z on level 4 for bzip2 if the result is only slightly smaller than pbzip2 on level 9 but takes about 25 seconds instead of about 10. For higher 7z levels, where the difference in file size would be bigger, I guess it would make sense to use 7z.
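To put the bzip2 numbers side by side, the sketch below turns the sizes and times reported so far into a compression ratio and an approximate throughput. The figures are copied from this comment; the times are the rough "about 10" and "about 25" seconds, so the results are only indicative. `Run` is a hypothetical helper type.

```python
from dataclasses import dataclass

ORIGINAL_MB = 730.0  # "about 730MB" in the post


@dataclass
class Run:
    tool: str
    size_mb: float   # compressed size as reported
    seconds: float   # approximate wall time as reported

    @property
    def ratio(self) -> float:
        """Compressed size as a fraction of the original."""
        return self.size_mb / ORIGINAL_MB

    @property
    def throughput(self) -> float:
        """Input megabytes processed per second."""
        return ORIGINAL_MB / self.seconds


RUNS = [
    Run("7z bzip2 level 1", 345.2, 25),
    Run("7z bzip2 level 4", 340.6, 25),
    Run("lbzip2 -9", 341.1, 10),
    Run("pbzip2 -9", 341.0, 10),
]

for run in RUNS:
    print(f"{run.tool:18} {run.ratio:6.1%} of original, {run.throughput:5.1f} MB/s")
```

All four runs land at roughly 47% of the original size, while throughput works out to about 29 MB/s for 7z against about 73 MB/s for the parallel tools.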

So I think that on “normal” level you should go with pbzip2… and since it’s really fast, I think that you don’t need any lower levels for bzip2, even for older machines, because of how super-optimized pbzip2 seems to be. “Normal compression (fast)” could be the default, and if someone wants the highest compression regardless of speed, you could have an option “high compression (very slow)” with 7z on maybe level 7 and “extreme (ultra slow)” with 7z on its level 9.

BTW... right now, in the latest Keka beta, Keka seems to be using 7z to compress to bzip2 if you are only compressing one file, but it uses lbzip2 if you drag & drop multiple files, so compressing one file to bzip2 with Keka is slower than compressing that same file together with a bunch of other files.

There seems to be a similar issue when compressing a single file to gzip with Keka too... Keka only seems to use pigz for compression if you compress multiple files and pigz also seems super-fast... but when compressing a single file, it uses 7z to compress to gzip and that just feels super-slow when compared to pigz compression...

7z doesn't even seem to be using multithreading when compressing to gzip...

I'll do some more testing with gzip format to see whether 7z on its highest settings can compress to gzip significantly better than pigz, enough to justify using 7z on slow settings without multithreading... and maybe how it compares to zopfli. Maybe 7z could be used just on the slowest settings as some "ultra slow" option, and the pigz compression could be called "super fast".

Also... don't forget to use pbzip2 and pigz for decompressing bzip2 and gzip files, for better performance.

@MaxPower85
Author

Some more testing with pigz and 7z compressing to gzip format.

With pigz on its level 9, it compressed a 729.1 MB file to 344.6 MB in maybe about 10 seconds... it's super fast...

But pigz also has level 11 (it skips level 10) which uses zopfli to compress... it uses multithreading too, but zopfli is not so fast, so even with multithreading it takes a long time... although the compression is pretty good...

On level 11, pigz (with zopfli) compressed the same file to 338.6 MB... but it took much longer to compress it.

With 7z on its level 9 it compressed that same file to 339.1 MB... but as I mentioned earlier, 7z doesn't seem to be using multithreading for gzip compression, so that also takes much longer than pigz on level 9.

Since 7z on its level 9 and pigz on its level 11 (with zopfli) were slow for compressing large files, I did some additional tests with smaller files, so I wouldn't have to wait very long to measure how long it takes for the same file to get compressed...

For a 22.8 MB file, 7z on its level 9 needed about 20 seconds to compress it to 7.8 MB... pigz on its level 9 needed only about 1 or maybe 2 seconds to compress the same file to 8.1 MB... pigz on its level 11 (which uses zopfli) needed about 44 seconds to compress the same file to 7.7 MB.
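One way to read the 22.8 MB test is to ask how many extra seconds each slower option costs per extra megabyte it saves compared with pigz on level 9. The sketch below does that arithmetic with the figures above; the 2-second baseline is an assumption taken from the upper end of "about 1 or maybe 2 seconds", and `marginal_cost` is a hypothetical helper.

```python
# (label, compressed size in MB, approximate seconds) for the 22.8 MB file
BASELINE = ("pigz -9", 8.1, 2.0)  # "about 1 or maybe 2 seconds"; 2 s assumed
SLOWER = [
    ("7z level 9", 7.8, 20.0),
    ("pigz -11 (zopfli)", 7.7, 44.0),
]


def marginal_cost(baseline, candidate) -> float:
    """Extra seconds spent per extra MB saved, relative to the baseline."""
    _, base_size, base_secs = baseline
    _, size, secs = candidate
    saved = base_size - size
    if saved <= 0:
        return float("inf")  # no size gain at all
    return (secs - base_secs) / saved


for candidate in SLOWER:
    cost = marginal_cost(BASELINE, candidate)
    print(f"{candidate[0]}: about {cost:.0f} extra seconds per MB saved")
```

With these rough inputs, 7z costs about 60 extra seconds per megabyte saved and pigz level 11 about 105, which supports treating both as opt-in "slow" presets rather than defaults.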

So... for gzip format, since compressing with pigz on level 9 is very fast, I think that should be the default option and you could call it "good compression (super fast)"... and if someone wants even better compression, after that you could maybe have "better compression (very slow)" with 7z on its level 9 and "best compression (ultra slow)" with pigz on its level 11.
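The preset suggestions for both formats can be written down as a small lookup table. This is only a sketch of the proposal, not how Keka is implemented: the labels and levels come from the comments above, the stock `pigz`, `pbzip2` and `7z` binaries are assumed (Keka bundles its own `kekapigz` and `keka7z`, whose invocation may differ), and `build_command` is a hypothetical helper that only builds the argument list without running anything.

```python
# (format, preset label) -> base argv; labels and levels follow the thread's suggestions.
PRESETS = {
    ("gzip", "good compression (super fast)"): ["pigz", "-9", "-k"],
    ("gzip", "better compression (very slow)"): ["7z", "a", "-tgzip", "-mx=9"],
    ("gzip", "best compression (ultra slow)"): ["pigz", "-11", "-k"],
    ("bzip2", "normal compression (fast)"): ["pbzip2", "-z", "-9", "-k"],
    ("bzip2", "high compression (very slow)"): ["7z", "a", "-tbzip2", "-mx=7"],
    ("bzip2", "extreme (ultra slow)"): ["7z", "a", "-tbzip2", "-mx=9"],
}

EXTENSIONS = {"gzip": ".gz", "bzip2": ".bz2"}


def build_command(fmt: str, label: str, source: str) -> list[str]:
    """Return the argv for compressing a single file with the chosen preset."""
    base = PRESETS[(fmt, label)]
    if base[0] == "7z":
        # 7z takes the archive name first, then the input file.
        return base + [source + EXTENSIONS[fmt], source]
    # pigz and pbzip2 write the compressed file next to the input; -k keeps the original.
    return base + [source]


print(build_command("gzip", "good compression (super fast)", "backup.img"))
print(build_command("bzip2", "extreme (ultra slow)", "backup.img"))
```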

@aonez aonez added this to the Look at milestone Mar 22, 2018
@aonez aonez self-assigned this Mar 22, 2018
@stale stale bot added the stale label Apr 27, 2018
@stale stale bot closed this as completed May 4, 2018
@ghost

ghost commented Dec 28, 2018

Bump this.

I am (de)compressing a single file to/from gz. I saw that keka7z is used and CPU isn't much utilized. Is there any concern about not using kekapigz?

It does use kekapigz when (de)compressing multiple files.

@aonez

Owner

aonez commented Dec 31, 2018

@ffffwh you're right. I'll run some tests; I'm not sure I can easily get pigz's progress.
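On the progress question, one possible approach (an assumption, not something confirmed for Keka) is to stop letting the compressor open the file itself and instead feed it through stdin, counting the bytes handed over. The sketch below does that for any command that reads stdin and writes stdout. `compress_with_progress` and `STAND_IN` are hypothetical names, and the demo uses a plain copy command so it runs without pigz installed. Note that the figure reported is bytes fed to the compressor, which can run slightly ahead of what has actually been compressed.

```python
import os
import subprocess
import sys
import tempfile
from typing import Callable, Sequence

CHUNK = 1 << 20  # feed the compressor 1 MiB at a time


def compress_with_progress(
    argv: Sequence[str],
    source: str,
    dest: str,
    on_progress: Callable[[int, int], None],
) -> int:
    """Pipe source through argv's stdin into dest; report bytes fed so far."""
    total = os.path.getsize(source)
    done = 0
    with open(source, "rb") as src, open(dest, "wb") as out:
        proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=out)
        with proc.stdin:  # closing stdin signals end of input
            while chunk := src.read(CHUNK):
                proc.stdin.write(chunk)
                done += len(chunk)
                on_progress(done, total)
        return proc.wait()


# Stand-in for a real compressor: copies stdin to stdout unchanged.
# With pigz installed, argv could be ["pigz", "-9", "-c"] instead.
STAND_IN = [
    sys.executable, "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "input.bin")
        with open(src_path, "wb") as f:
            f.write(os.urandom(3 * CHUNK))
        code = compress_with_progress(
            STAND_IN, src_path, os.path.join(tmp, "output.bin"),
            lambda done, total: print(f"{done * 100 // total}%"),
        )
        print("exit code:", code)
```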

@aonez aonez reopened this Dec 31, 2018
@stale stale bot removed the stale label Dec 31, 2018
@stale stale bot added the stale label Jan 23, 2019
@aonez aonez added the blessed label Jan 29, 2019
Repository owner deleted a comment from stale bot Jan 29, 2019
Repository owner deleted a comment from stale bot Jan 29, 2019
@aonez aonez removed the stale label Jan 29, 2019
@izian

izian commented Apr 21, 2024

Just me here 5 years later to confirm on macOS 14.4 and Keka 1.3.3 that using Keka for Gzip compression of a big 1GB file uses 1 thread and takes .... forever. But if I add an "ignoreMe" file and drag both, kekapigz takes over with many threads and annihilates the task in a fraction of the time... like 1/40th of the time, just because I added an extra file.

Blew my mind. I can't see any setting to make it use kekapigz all the time and multi-thread compress that single file.

EDIT: Ok I found out how; you simply enable the setting in preferences for "always tarball non-archiving formats"
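Putting the reports in this thread together, the behaviour people observed for gzip can be modelled as below. This is a sketch of what was observed from the outside, not Keka's actual code: `pick_gzip_backend` is a hypothetical function, and the output extensions are an assumption based on the setting's name.

```python
def pick_gzip_backend(paths: list[str], always_tarball: bool = False) -> tuple[str, str]:
    """Return (helper binary, output extension) as reported in this thread."""
    # Several inputs have to be bundled into a tar first; the
    # "always tarball non-archiving formats" preference forces that for one file too.
    needs_tar = len(paths) > 1 or always_tarball
    if needs_tar:
        return "kekapigz", ".tar.gz"  # multithreaded path
    return "keka7z", ".gz"            # single-threaded path reported as slow


print(pick_gzip_backend(["big.img"]))
print(pick_gzip_backend(["big.img", "ignoreMe"]))
print(pick_gzip_backend(["big.img"], always_tarball=True))
```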
