Add an option to compile against zlib-ng instead of zlib #335

Closed
wants to merge 2 commits into from

Conversation

dralley
Contributor

@dralley dralley commented Jan 12, 2023

Zlib-ng has roughly (ymmv) twice the performance of standard zlib.

@dralley dralley force-pushed the zlib-ng branch 5 times, most recently from dcc4a53 to 9e68072 Compare January 12, 2023 20:54
@dralley dralley marked this pull request as ready for review January 12, 2023 20:55
@dralley dralley force-pushed the zlib-ng branch 2 times, most recently from decaa01 to 45a1624 Compare January 12, 2023 21:30
@dralley dralley marked this pull request as draft January 12, 2023 23:11
Zlib-ng has roughly (ymmv) twice the performance of standard zlib.
The format is of course compliant but at any given compression level
zlib-ng may use a different set of optimizations and thus the checksums
don't match.
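
A minimal sketch of why checksums can differ while the data stays valid. Python's `zlib` module at two compression levels stands in here for two different deflate implementations; that substitution is an assumption made for illustration, not a test of zlib-ng itself. The helper name `compressed_checksums` and the sample payload are hypothetical.

```python
import hashlib
import zlib


def compressed_checksums(payload: bytes, levels=(1, 9)) -> dict:
    """Compress payload at each level and return {level: sha256 of the compressed bytes}."""
    result = {}
    for level in levels:
        compressed = zlib.compress(payload, level)
        # Every variant must decompress back to the identical payload.
        assert zlib.decompress(compressed) == payload
        result[level] = hashlib.sha256(compressed).hexdigest()
    return result


# Hypothetical stand-in for repository metadata.
sample = b"<package name='example' arch='noarch'/>\n" * 1000
sums = compressed_checksums(sample)
for level, digest in sums.items():
    print(level, digest)
```

Both streams are valid and decompress to the same bytes, but the compressed files hash differently, so any checksum taken over the compressed file depends on which compressor and settings produced it.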
@dralley
Contributor Author

dralley commented Jan 13, 2023

@praiskup I used that rubygem repo from COPR with 270,000 packages to test.

The end result is a little less impressive than I was hoping. The main issue is that without --skip-stat --recycle-packagelist --update, the difference gets swamped by stat calls and reading headers, and without --no-database it gets swamped by the cost of BZ2 compression, which seems to be very inefficient even in comparison to original zlib.

In the best-case scenario, using all 4 flags together, it reduces time spent inside the compression library by about 40%, but that only equates to a 10% overall improvement in runtime, which might not be worthwhile relative to the maintenance burden. zstd should be faster than that and result in a 10-25% overall improvement, which is easier to justify, especially given the file size benefits.
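
The relationship between those two percentages can be sketched as simple Amdahl-style arithmetic. This is a rough model that assumes the rest of the run is unaffected by the compression library; the function names are hypothetical and only the 40% and 10% figures come from the measurements above.

```python
def implied_fraction(component_reduction: float, overall_reduction: float) -> float:
    """Share of total runtime spent in the component, inferred from the two reductions."""
    return overall_reduction / component_reduction


def overall_reduction(fraction: float, component_reduction: float) -> float:
    """Overall runtime reduction when a component taking `fraction` of the run gets faster."""
    return fraction * component_reduction


# 40% less time in the compression library gave about 10% less runtime overall.
share = implied_fraction(0.40, 0.10)
print(f"compression share of runtime: {share:.0%}")

# Under the same model, even a compressor with zero cost is capped at that share.
print(f"upper bound on overall gain: {overall_reduction(share, 1.0):.0%}")
```

Under this model compression accounts for roughly a quarter of the best-case runtime, which is why a large speedup inside the library shows up as a much smaller end-to-end gain.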

But as far as COPR goes, it would be a great idea to disable generating sqlite metadata everywhere except EL6 and EL7 repos, because it has a huge impact. Disabling it dropped the runtime from 242 seconds to 106 seconds.

DNF doesn't use it, so it's safe to drop, but I think "mdapi" does still use it currently. I don't know if that's a blocker for COPR the way it is for Fedora proper: https://pagure.io/releng/issue/10745

@dralley dralley closed this Jan 13, 2023
@dralley
Contributor Author

dralley commented Jan 13, 2023

This is a profile of a --skip-stat --recycle-packagelist --update run on the aforementioned repo with original zlib and including sqlite metadata. You can see that the main issue is specifically the bz2 compression of the sqlite metadata.

[Image: std-zlib-with-sqlite]

@praiskup
Member

Thank you for another look at performance. We use --skip-stat
--recycle-packagelist --update for sure, otherwise it would be entirely
unusable. Disabling sqlite data will help a lot, and is tracked here.
Going to zstd seems to be a long-term task (not sure how likely it is to
get this into e.g. RHEL 9+ at least).

@dralley
Contributor Author

dralley commented Jan 13, 2023

zstd should be supported on RHEL 8+ unless it's compiled out when building libsolv. It's hard to test without having a repo first.


2 participants