Gzip level 1 compression is still too high for many users of import/export #4434

Closed
dralley opened this issue Sep 15, 2023 · 1 comment

dralley commented Sep 15, 2023

Even the lowest level of gzip compression, level 1, consumes a disproportionate amount of compute time compared to the benefit it provides, which is tiny in most cases because the data is often already compressed.

For example, with no compression at all, an export of CentOS Stream 9 BaseOS + RHEL 9 BaseOS required 11.8 GB of disk.

With level 1 compression, it required 11.4 GB. With level 9 compression, it required 11.3 GB.

On my system, 63% of the export time in this test was spent compressing the export. Going from level 9 to level 1 to level 0 brought the runtime from 10.5 min to 8.5 min to 4 min, respectively.
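
To put those figures in proportion, here is a quick back-of-the-envelope calculation. It uses only the sizes and runtimes quoted in this comment; the helper names `space_saved_pct` and `slowdown` are made up for illustration and are not part of pulpcore.

```python
# Figures quoted in this comment: export size in GB and runtime in minutes,
# keyed by gzip compression level.
size_gb = {0: 11.8, 1: 11.4, 9: 11.3}
runtime_min = {0: 4.0, 1: 8.5, 9: 10.5}


def space_saved_pct(level):
    """Disk space saved relative to the uncompressed export, in percent."""
    return (size_gb[0] - size_gb[level]) / size_gb[0] * 100


def slowdown(level):
    """Export runtime as a multiple of the uncompressed export's runtime."""
    return runtime_min[level] / runtime_min[0]


for level in (1, 9):
    print(
        f"level {level}: {space_saved_pct(level):.1f}% smaller, "
        f"{slowdown(level):.1f}x the runtime"
    )
```

In this one measurement, compression saved roughly 3 to 4 percent of disk space while more than doubling the export time.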

Users of import/export are often dealing with data on the scale of hundreds of gigabytes or multiple terabytes, and disk space seems to be less of an issue than the amount of time these exports take. It's best that we default to no compression at all, and look at making it configurable later.

Luckily, "gzip level 0" does exist: it packs everything into a valid gzip archive without performing compression, so we can drop this change in without breaking compatibility with anything.
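
As a minimal sketch of why level 0 keeps compatibility, the snippet below uses Python's standard `gzip` module, not the actual pulpcore export code. The `store_only` helper and the random payload are assumptions chosen for illustration; random bytes stand in for content that is already compressed.

```python
import gzip
import os


def store_only(data: bytes) -> bytes:
    """Wrap data in a gzip container without compressing it (level 0)."""
    return gzip.compress(data, compresslevel=0)


# Random bytes stand in for already-compressed content such as RPMs.
payload = os.urandom(1024 * 1024)
stored = store_only(payload)

# The output begins with the gzip magic bytes, so it is a regular gzip stream.
assert stored[:2] == b"\x1f\x8b"

# An ordinary gzip reader gets the original bytes back unchanged.
assert gzip.decompress(stored) == payload

# Level 0 only adds framing overhead, so the output is slightly larger
# than the input rather than smaller.
print(len(payload), len(stored))
```

Because the result is still an ordinary gzip stream, anything that already reads `.gz` files should be able to read it without changes.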

dralley added a commit to dralley/pulpcore that referenced this issue Sep 15, 2023
dralley added the BZ label Sep 15, 2023

pulpbot commented Sep 15, 2023

dralley added a commit to dralley/pulpcore that referenced this issue Sep 18, 2023
mdellweg pushed a commit that referenced this issue Sep 19, 2023
closes #4434

(cherry picked from commit 70c46ec)
dralley added a commit that referenced this issue Sep 19, 2023
closes #4434

(cherry picked from commit 70c46ec)
Odilhao added a commit to Odilhao/forklift that referenced this issue Sep 21, 2023
Pulpcore changed its compression logic, so the value that bats expects
should be updated.

See pulp/pulpcore#4434 and
pulp/pulpcore#4411 for more details
Odilhao added a commit to theforeman/forklift that referenced this issue Sep 21, 2023
Pulpcore changed its compression logic, so the value that bats expects
should be updated.

See pulp/pulpcore#4434 and
pulp/pulpcore#4411 for more details
dralley added a commit to dralley/pulpcore that referenced this issue Oct 4, 2023
dralley added a commit that referenced this issue Oct 5, 2023
closes #4434

(cherry picked from commit 70c46ec)