proposal: archive/zip: add already compressed files #34974

Open
saracen opened this issue Oct 17, 2019 · 4 comments

@saracen saracen commented Oct 17, 2019

If you have a compressed file and know its uncompressed size and CRC32 checksum, it'd be nice to be able to add it to an archive as is.

I have three current use-cases for this:

  • Repackaging a zip file, removing or adding files, without incurring the associated compression overheads for files that already exist (somewhat achieving #15626)
  • Compress first and include later based on whether the compressed size was smaller than the original, without having to perform the compression twice.
  • Support concurrent compression: compress many files concurrently and then copy the already compressed files to the archive (similar to Apache Commons Compress' ParallelScatterZipCreator)
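
To make the second and third use cases concrete, here is a minimal sketch of the "compress first, decide later" step, run concurrently over several inputs. The `precompressed` type and the `precompress` and `precompressAll` helpers are hypothetical names chosen for illustration; they are not part of archive/zip. The sketch only gathers what a raw-add API would need (the bytes to store, the CRC32, and the uncompressed size) and does not write an archive.

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"hash/crc32"
	"sync"
)

// precompressed is a hypothetical holder for the values a raw-add API
// would need from the caller.
type precompressed struct {
	data             []byte // raw DEFLATE stream, or the original bytes if stored
	crc32            uint32 // CRC-32 (IEEE) of the uncompressed bytes
	uncompressedSize uint64
	deflated         bool // false means "store as is"
}

// precompress deflates src once and keeps the result only if it is smaller.
func precompress(src []byte) (precompressed, error) {
	var buf bytes.Buffer
	fw, err := flate.NewWriter(&buf, flate.DefaultCompression)
	if err != nil {
		return precompressed{}, err
	}
	if _, err := fw.Write(src); err != nil {
		return precompressed{}, err
	}
	if err := fw.Close(); err != nil {
		return precompressed{}, err
	}
	p := precompressed{
		crc32:            crc32.ChecksumIEEE(src),
		uncompressedSize: uint64(len(src)),
	}
	if buf.Len() < len(src) {
		p.data, p.deflated = buf.Bytes(), true
	} else {
		p.data = src
	}
	return p, nil
}

// precompressAll compresses every input in its own goroutine.
// Each goroutine writes only to its own slice index.
func precompressAll(inputs [][]byte) ([]precompressed, error) {
	out := make([]precompressed, len(inputs))
	errs := make([]error, len(inputs))
	var wg sync.WaitGroup
	for i, src := range inputs {
		wg.Add(1)
		go func(i int, src []byte) {
			defer wg.Done()
			out[i], errs[i] = precompress(src)
		}(i, src)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return out, nil
}

func main() {
	inputs := [][]byte{bytes.Repeat([]byte("gopher "), 100), []byte("x")}
	results, err := precompressAll(inputs)
	if err != nil {
		panic(err)
	}
	for _, p := range results {
		fmt.Println(p.deflated, len(p.data), p.uncompressedSize)
	}
}
```

Without a way to hand these bytes to the archive writer unchanged, the caller would have to pass the uncompressed data to the writer and pay for compression a second time.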

I see three different ways we could achieve this:

  1. Create a zip.CreateHeaderRaw(fh *FileHeader) (io.Writer, error) function that uses the FileHeader's CRC32 and UncompressedSize64 fields as set by the user.
  2. Use the existing CreateHeader(fh *FileHeader) (io.Writer, error) function, but add a new FileHeader field that indicates we're going to write already compressed data (and then use the CRC32 and UncompressedSize64 fields set by the user).
  3. Use the existing CreateHeader(fh *FileHeader) (io.Writer, error) function, but if CompressedSize64 has already been set, assume that the data written is already compressed.

I'm going to assume that option 3 would be a no-go, because existing code might suddenly break if a user has already set the CompressedSize for whatever reason, but hopefully the other options are viable.
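
As a rough illustration of what option 1 looks like from the caller's side, the sketch below uses `(*zip.Writer).CreateRaw`, which was added to archive/zip in Go 1.17 and has the same shape as the `CreateHeaderRaw` proposed here. Treat it as a stand-in, not as the API of the change under discussion. The `addRaw` and `roundTrip` helpers are hypothetical names, and the sketch assumes the entry was compressed with raw DEFLATE, which is what the zip Deflate method expects.

```go
package main

import (
	"archive/zip"
	"bytes"
	"compress/flate"
	"fmt"
	"hash/crc32"
	"io"
)

// addRaw writes already-deflated bytes to zw without recompressing them.
// The caller supplies the CRC32 and size of the uncompressed data.
func addRaw(zw *zip.Writer, name string, deflated []byte, crc uint32, size uint64) error {
	fh := &zip.FileHeader{
		Name:               name,
		Method:             zip.Deflate,
		CRC32:              crc,
		CompressedSize64:   uint64(len(deflated)),
		UncompressedSize64: size,
	}
	w, err := zw.CreateRaw(fh) // Go 1.17+; stand-in for the proposed CreateHeaderRaw
	if err != nil {
		return err
	}
	_, err = w.Write(deflated)
	return err
}

// roundTrip compresses content once, adds it raw, and reads it back
// through zip.Reader, which verifies the CRC32 while decompressing.
func roundTrip(content []byte) ([]byte, error) {
	var comp bytes.Buffer
	fw, err := flate.NewWriter(&comp, flate.BestSpeed)
	if err != nil {
		return nil, err
	}
	if _, err := fw.Write(content); err != nil {
		return nil, err
	}
	if err := fw.Close(); err != nil {
		return nil, err
	}

	var archive bytes.Buffer
	zw := zip.NewWriter(&archive)
	err = addRaw(zw, "hello.txt", comp.Bytes(), crc32.ChecksumIEEE(content), uint64(len(content)))
	if err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}

	zr, err := zip.NewReader(bytes.NewReader(archive.Bytes()), int64(archive.Len()))
	if err != nil {
		return nil, err
	}
	rc, err := zr.File[0].Open()
	if err != nil {
		return nil, err
	}
	defer rc.Close()
	return io.ReadAll(rc)
}

func main() {
	got, err := roundTrip([]byte("hello, raw zip entry"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", got)
}
```

Keeping the raw path in a separate method means an existing CreateHeader caller that happens to set CompressedSize64 sees no change in behavior, which is the concern raised about option 3.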

@julieqiu julieqiu commented Oct 18, 2019

/cc @dsnet

@gopherbot gopherbot commented Oct 19, 2019

Change https://golang.org/cl/202217 mentions this issue: archive/zip: support adding raw files

@saracen saracen commented Oct 19, 2019

Added a change that uses option 1 (CreateHeaderRaw), because it feels weird to add a field to FileHeader that is used solely when writing an archive, given that the structure is also used when reading one.

@ianlancetaylor ianlancetaylor commented Oct 19, 2019

This is going to be new API, so I'm turning it into a proposal.

@dsnet Please add any thoughts you may have about this functionality. Thanks.

@ianlancetaylor ianlancetaylor changed the title archive/zip: add already compressed files proposal: archive/zip: add already compressed files Oct 19, 2019
@gopherbot gopherbot added this to the Proposal milestone Oct 19, 2019
@gopherbot gopherbot added the Proposal label Oct 19, 2019
@rsc rsc added this to Incoming in Proposals Dec 4, 2019