cosmicexplorer/medusa-zip

medusa-zip

High-performance parallelized implementations of common zip file operations.

See discussion in pex-tool/pex#2158.

Crimes

This crate adds some hacks to a fork of the widely-used zip crate (see the diff at https://github.com/zip-rs/zip/compare/master...cosmicexplorer:zip:merge-entries?expand=1). When the merge feature is enabled on this fork of zip, two crimes are unveiled:

  1. merge_archive():
    • This will copy over the contents of another zip file into the current one without deserializing any data.
    • This enables parallelization of arbitrary zip commands, as multiple zip files can be created in parallel and then merged afterwards.
  2. finish_into_readable():
    • Creating a writable ZipWriter and then converting it into a readable ZipArchive is a very common operation when merging zip files.
    • This likely has zero performance benefit, but it is a good example of the types of investigations you can do with the zip format, especially against the well-written zip crate.
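
The workflow that merge_archive() enables can be sketched in Python, the language of the zipfile consumers this project targets. This is a stand-in under stated assumptions, not this crate's API: build_shard, merge_shards, and parallel_zip are hypothetical names, threads stand in for whatever parallelism the real tool uses, and the merge step here decompresses and recompresses every entry through zipfile, which is the work merge_archive() is described as skipping.

```python
import io
import zipfile
from concurrent.futures import ThreadPoolExecutor


def build_shard(entries):
    """Write one independent in-memory zip from a list of (name, data) pairs."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in entries:
            zf.writestr(name, data)
    return buf.getvalue()


def merge_shards(shards):
    """Append the entries of each shard, in order, to a single new zip."""
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as merged:
        for shard in shards:
            with zipfile.ZipFile(io.BytesIO(shard)) as src:
                for info in src.infolist():
                    # read() decompresses and writestr() recompresses.
                    # A raw merge would copy the compressed bytes instead.
                    merged.writestr(info, src.read(info))
    return out.getvalue()


def parallel_zip(entries, shard_count=4):
    """Split entries into contiguous chunks, zip each in parallel, then merge."""
    size = max(1, -(-len(entries) // shard_count))  # ceiling division
    chunks = [entries[i:i + size] for i in range(0, len(entries), size)]
    with ThreadPoolExecutor() as pool:
        # pool.map() yields results in input order, so entry order is kept.
        shards = list(pool.map(build_shard, chunks))
    return merge_shards(shards)
```

Contiguous chunks are used so that the merged archive lists entries in the same order as the input. In this sketch the merge step repeats the compression work, so it only illustrates the shape of the pipeline; the point of a raw merge is that the final step becomes a cheap copy.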

Compatibility

We mainly need compatibility with Python's zipfile and zipimport modules (see pex-tool/pex#2158 (comment)). Also see the zipimport PEP (PEP 273). I currently believe that this program's output is fully compatible with both zipfile and zipimport.
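
One way to test that belief is to run an archive through both modules. The sketch below is an assumption about how such a check could look, not something this repository ships: check_zip_compat is a hypothetical helper, and path and module_name are whatever archive and top-level module you want to probe.

```python
import zipfile
import zipimport


def check_zip_compat(path, module_name):
    """Validate `path` with zipfile, then load `module_name` from it via zipimport.

    Returns the namespace dict produced by executing the module's code.
    """
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every entry and returns the first bad name, or None.
        bad = zf.testzip()
        if bad is not None:
            raise zipfile.BadZipFile(f"corrupt entry: {bad}")

    # zipimporter raises ZipImportError if the archive cannot be parsed,
    # and get_code() raises it if the module is not found inside.
    importer = zipimport.zipimporter(str(path))
    code = importer.get_code(module_name)

    namespace = {"__name__": module_name}
    exec(code, namespace)
    return namespace
```

Calling check_zip_compat("out.zip", "hello") on an archive containing hello.py would return that module's globals, or raise if either zipfile or zipimport rejects the file. It exercises only the read paths of the two modules, so it is a smoke test rather than proof of full compatibility.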

TODO

  • benchmark zip creation (vs zip crate)
  • benchmark zip merging (vs zip crate)

License

Apache v2.

About

A library/binary for parallel zip creation.
