-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add conditional zstd compression support #345
Conversation
createrepo_c.spec
Outdated
@@ -12,8 +12,10 @@ | |||
|
|||
%if 0%{?rhel} | |||
%bcond_with zchunk | |||
%bcond_with zstd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike zchunk, support for zstd compressed metadata is (I believe) already enabled in libsolv on RHEL, there are just no repos with such metadata because none of the tools support it yet. So it might make sense to likewise default to ON for RHEL >7. Or at least 9+.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have enabled it also for rhel 8. I believe it was enabled by: https://bugzilla.redhat.com/show_bug.cgi?id=1914876 (that is since 8.4) so I have added a note about that.
I've filed fedora-modularity/libmodulemd#606, as far as I can tell createrepo_c uses its own wrappers for opening up modulemd metadata files so it shouldn't be a problem here, but we wouldn't want to produce zstd compressed module metadata and have libmodulemd not be able to read it. |
Basic testing with seems to work fine. Results are good. This is with the Fedora 37 "Everything" release repo. Before: After (zstd level 9): After (zstd level 11): Doesn't seem to be a lot of difference between 9 and 11 though the latter takes about 30% longer. I'd maybe recommend going with that unless we find some cases where there's a significant difference. |
I think it is a great time to change createrepo defaults and not only for compression. But we have to be careful that metadata are compatible with DNF from RHEL8.0 For testing it would be also good to test a large RHEL8 metadata. If I good remember they have a different order of element then Fedora metadata. As most important value I suggest compressed size, decompression time, and RAM requirement for decompression. I am mentioning RAM, because if I good remember with some setting it could create the large dictionary that requires more RAM - minimal system requirements are not small at all, but people have 0.5 or 1 GB RAM on their systems and then they complain to DNF. I think that the time required to compress files is important but it has less impact then other values. |
You mean no order :) RHELDST-15791
As far as I can tell lower level = less memory, as with higher levels the window and dictionary sizes are larger. I did use |
Ok, I have set it to level 9 for now, it seems like a good compromise between compression efficiency and speed. I did some quick tests as well (on rhel metadata with 22 000 pkgs) and it seems that zstd (with level 9) is strictly better than gz in all mentioned aspects except maybe for memory usage during decompression but only by a tiny amount (~2%). I can do a more detailed analysis when we will be changing the default from gz to zstd. |
It still looks fine, but what got updated in the last push? |
Sorry, I should explain, there was a segfault when accessing one of the errors. |
Magic numbers and file name extension of Zstandard archives are taken from RFC 8878. File extensions were consuleted with crearerepo_c <rpm-software-management/createrepo_c#345>. Decompression is implemented with rpmio, which added the support in rpm-4.14.0. Zstandard support in rpmio is checked at run-time. NOTE: Only standardized "zst" file name extension is suppored. I believe that createrepo_c won't implement nonstandard "zstd" extension <rpm-software-management/createrepo_c#345>. NOTE: Version bumped to 2.15 because this adds a new feature and enhances API. TODO: Check interoperability with crearerepo_c when it receives the support. TODO: Deprecate MODULEMD_COMPRESSION_TYPE_SENTINEL and remove it on next ABI break. This end of enum prevents from adding new enum values in natural order. E.g. if an application uses MODULEMD_COMPRESSION_TYPE_SENTINEL for whatever reason, inserting a value before it would change MODULEMD_COMPRESSION_TYPE_SENTINEL value which is an ABI break. Funilly that value is not used anywhere in libmodulemd code. Only in tests to iterate over the enum. fedora-modularity#606
59260bb
to
8c57740
Compare
src/compression_wrapper.c
Outdated
@@ -255,6 +277,8 @@ cr_compression_type(const char *name) | |||
type = CR_CW_XZ_COMPRESSION; | |||
if (!g_strcmp0(name_lower, "zck")) | |||
type = CR_CW_ZCK_COMPRESSION; | |||
if (!g_strcmp0(name_lower, "zst") || !g_strcmp0(name_lower, "zstd")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this "zst" and "zstd" choice? The compression is named "Zstandard" and "zstd". I think you are still mistaken it with "zst" file name extension. Can you remove "zst" option from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, sorry, I went through it again, hopefully "zst" is just an extension now.
@kontura Is anything remaining blocking this? |
@dralley it should be ready. I need to find a reviewer. |
For: #82
Including
zstd
is optional but defaults toON
.I have converted PR: #251 to not use
rpmio
but instead usezstd
directly by basically inlining the rpmzstd
functions.I chose the compression level value
11
kind of arbitrarily, we may want to find a better value.