archive/zip: io.Reader like archive/tar #10568

Closed
krolaw opened this issue Apr 24, 2015 · 7 comments

@krolaw krolaw commented Apr 24, 2015

I stream-read zip files in Java and have a couple of Go projects that would benefit from the same capability. Although a zip file has a footer (the central directory) that allows random access, its files are stored sequentially, which allows for streaming as well.

Member

@minux minux commented Apr 24, 2015

Author

@krolaw krolaw commented Apr 25, 2015

Impossible? Here's a Java snippet (the critical code) from an installer I wrote years ago. It reads large zip files and displays images from near the start of the archive while the rest of it downloads.

URLConnection connection = new URL(updateZipURL).openConnection();
// MyStream is a custom InputStream wrapper from the installer (not shown).
BufferedInputStream in = new BufferedInputStream(new MyStream(connection.getInputStream()));
// zipStream is presumably a ZipInputStream declared elsewhere in the installer.
zipStream = new ZipInputStream(in);
ZipEntry entry;
while ((entry = zipStream.getNextEntry()) != null) {
    // If image, add to display queue, otherwise unpack to file.
}

API for ZipInputStream here: https://docs.oracle.com/javase/7/docs/api/java/util/zip/ZipInputStream.html

To answer your first question, I want to port the installer to Go. I also have another project that would benefit from this library.

Member

@minux minux commented Apr 25, 2015

Of course it won't work in the general case.

Consider this:
zip a.zip malicious_stuff
zip b.zip good_stuff
cat a.zip b.zip > c.zip
zip -A c.zip

zipinfo c.zip or unzip c.zip will show only good_stuff; a streaming uncompressor, however, will definitely see malicious_stuff, and it may or may not see good_stuff.

In general, a zip file can have arbitrary data prefixed at the front. How could a streaming uncompressor deal with that?

Author

@krolaw krolaw commented Apr 25, 2015

I like your example. It shows the care and thoughtfulness that go into the standard library. The Javadoc makes no mention of how false positives are caught (they probably aren't) or how they should be handled.

To deal with arbitrary prefixed data, the Next function could return errors:
PreviousFilesInvalidError:
returned where a directory record ends but there is still more data in the stream. This would indicate two (or more) concatenated zip files, that all previously returned files were false, and that more files may be coming. The same error could also be used where the stream is closed without a directory record, or that case could return an InvalidFileFormatError or similar.

FirstNFilesInvalidError: could be used to catch cases where files were prepended without a directory record. Technically this error could/should be used for the above case as well.

Alternatively/Additionally, one could have explanations/warnings of streaming zips in the docs.

Thanks.

Author

@krolaw krolaw commented Apr 26, 2015

Hi Minux,

Given the concerns, I think I'll write my own third-party library and copy across bits from archive/zip.

Thanks for your time.

@krolaw krolaw closed this Apr 26, 2015
Member

@minux minux commented Apr 26, 2015

Author

@krolaw krolaw commented Dec 23, 2015

Finally got around to putting something together:
https://github.com/krolaw/zipstream

@golang golang locked and limited conversation to collaborators Dec 29, 2016