feat: improve zstd compression #469
@@ -195,19 +208,23 @@ pub fn write_conda_package<W: Write + Seek>(

let (info_paths, other_paths) = sort_paths(paths, base_path);

outer_archive.start_file(format!("pkg-{out_name}.tar.zst"), options)?;
let archive_path = format!("pkg-{out_name}.tar.zst");
Maybe call it just .tar, and/or use tempfile here?
done in 221873d
Looks good to me! Excellent improvement :)
let mut tar_file = File::open(&tar_path)?;
let compression_level = compression_level.to_zstd_level()?;
let mut zst_encoder = zstd::Encoder::new(writer, compression_level)?;
zst_encoder.multithread(num_cpus::get() as u32)?;
I think this should be configurable (default to this number, but users can choose a lower number if they want).
Added the ZSTD_NUMTHREADS variable in 6ae2197.
The inspiration came from: facebook/zstd#2241
I added a doc comment to the function, but it is private and thus not very visible. I think we should mention this somewhere; where would be a good place? I thought about putting it under a new "Environment Variables" section in README.md, where we would list the other variables as well (I think there is only CONDA_PREFIX for now).
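For illustration, the "default to the CPU count, but let users override it" behaviour discussed above could look roughly like the sketch below. This is not the code from 6ae2197: the helper names `resolve_zstd_threads` and `default_threads` are hypothetical, and it uses `std::thread::available_parallelism` from the standard library instead of `num_cpus` so that it runs without extra crates. Unset or unparsable values fall back to the default.

```rust
use std::thread;

/// Hypothetical helper (not from the PR): pick the number of zstd worker
/// threads from an optional `ZSTD_NUMTHREADS` value, falling back to
/// `default_threads` when the value is missing or not a valid number.
fn resolve_zstd_threads(env_value: Option<&str>, default_threads: u32) -> u32 {
    match env_value.and_then(|v| v.trim().parse::<u32>().ok()) {
        Some(n) => n,
        None => default_threads,
    }
}

/// Hypothetical helper: number of available cores, or 1 if it cannot be
/// determined. Stands in for `num_cpus::get()` in this sketch.
fn default_threads() -> u32 {
    thread::available_parallelism()
        .map(|n| n.get() as u32)
        .unwrap_or(1)
}

fn main() {
    // Read the override from the environment; absent means "use the default".
    let env_value = std::env::var("ZSTD_NUMTHREADS").ok();
    let threads = resolve_zstd_threads(env_value.as_deref(), default_threads());
    println!("zstd worker threads: {threads}");
}
```

The resolved value would then be passed to `zst_encoder.multithread(...)` in place of the hard-coded `num_cpus::get() as u32` shown in the diff.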
cc @dholth does this reflect what you are doing in
Turns out ChatGPT is wrong: facebook/zstd#3608 (comment) The output is different from single threaded but should not be different with the number of cores...
Hmm, there are more opinions on this topic:
I should increase the number of threads. I think the comment is wrong about the downsides. https://github.com/conda/conda-package-handling/blob/main/src/conda_package_handling/conda_fmt.py#L24-L27
Not sure that the env var is the right approach for rattler (maybe for
I added an argument to the
This PR improves the zstd compression by:
set_pledged_src_size