Make export compression useful or get rid of it entirely #4446

Closed
dralley opened this issue Sep 18, 2023 · 4 comments · Fixed by #4477

@dralley
Contributor

dralley commented Sep 18, 2023

see: #4434

We could do any of the following:

  • make the compression level configurable for users who really want it, even though it accomplishes little
  • get rid of compression entirely, and therefore ditch the monkeypatches
  • try out a more suitable algorithm such as zstd or lz4, which are much faster than e.g. gzip level 1 for similar amounts of compression

This would be a forwards-only change, no backports.
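To get a rough feel for the trade-off behind these options, here is a minimal sketch that measures output size and time for a few gzip levels using only the standard library. The random payload is an assumption standing in for already-compressed artifacts, the `measure` helper is hypothetical, and the timings are machine-dependent. zstd and lz4 are left out because they need third-party packages.

```python
import gzip
import os
import time


def measure(payload: bytes, level: int) -> tuple[int, float]:
    """Return (compressed size in bytes, seconds spent compressing)."""
    start = time.perf_counter()
    blob = gzip.compress(payload, compresslevel=level)
    elapsed = time.perf_counter() - start
    # Round-trip sanity check, done outside the timed region
    assert gzip.decompress(blob) == payload
    return len(blob), elapsed


# Assumption: random bytes approximate content that is already compressed
payload = os.urandom(1 << 20)

results = {level: measure(payload, level) for level in (0, 1, 6)}
for level, (size, secs) in results.items():
    ratio = size / len(payload)
    print(f"gzip level {level}: {size} bytes ({ratio:.3f}x) in {secs * 1000:.1f} ms")
```

At level 0 the output is slightly larger than the input, since the data is stored uncompressed inside the gzip framing.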

@dralley dralley changed the title Make compression useful or get rid of it Make export compression useful or get rid of it Sep 18, 2023
@dralley dralley changed the title Make export compression useful or get rid of it Make export compression useful or get rid of it entirely Sep 18, 2023
@mdellweg
Member

I'm fine with these suggestions. I just think we need to keep some versions able to read the current format, right?

@ipanova
Member

ipanova commented Sep 18, 2023

Not really, import/export is only compat between Z streams

@dralley
Contributor Author

dralley commented Sep 18, 2023

As far as I'm aware, we only allow exports to be imported on the same Y stream they were produced on. So the only rule is that we can't remove compatibility with the gzipped export files from existing releases - but we can remove it in new releases, because those new releases just need to be able to import whatever format they're currently exporting.

@dralley
Contributor Author

dralley commented Sep 19, 2023

It appears that using gzip at all (even at level 0, as we do) still incurs some overhead during imports, on the order of 12% in the test I ran.

[Image: Screenshot from 2023-09-18 22-44-59]

It's calculating a lot of CRC32 checksums internally, even though we verify integrity ourselves, and it also appears to still go through some "decompression" routines to seek around within the file.
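The CRC32 cost follows from the gzip format itself: per RFC 1952, every gzip member ends with a CRC32 of the uncompressed data and its length, whatever the compression level. A small sketch, using a made-up payload rather than a real export, shows the trailer is still there at level 0:

```python
import gzip
import struct
import zlib

# Made-up payload for illustration only
data = b"export chunk " * 1000

# Level 0 stores the data without compressing it, but still wraps it in gzip framing
blob = gzip.compress(data, compresslevel=0)

# RFC 1952: the last 8 bytes of a gzip member are CRC32 and ISIZE, little-endian
crc32_stored, isize = struct.unpack("<II", blob[-8:])

print(f"stored CRC32:   {crc32_stored:#010x}")
print(f"computed CRC32: {zlib.crc32(data):#010x}")
print(f"ISIZE: {isize}, input length: {len(data)}")
print(f"framing overhead: {len(blob) - len(data)} bytes")
```

A reader has to recompute this checksum over everything it decompresses in order to validate the trailer, so the work is duplicated when the importer already verifies integrity by other means.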

dralley added a commit to dralley/pulpcore that referenced this issue Sep 25, 2023
dralley added a commit to dralley/pulpcore that referenced this issue Sep 25, 2023
dralley added a commit to dralley/pulpcore that referenced this issue Sep 26, 2023
dralley added a commit that referenced this issue Sep 26, 2023