Fix zip file extension in cli compression #386

Closed
wants to merge 1 commit into from

webmaster128
Collaborator

No description provided.

@randombit
Owner

Using the zip extension here doesn't seem right. While zlib is afaik the same compression algo used in zip, the output of --compress=zlib will not be a standard infozip file and in fact I'm not sure there are any standard Unix or Windows tools which understand 'plain' RFC 1950 zlib. Unless I'm missing something else about this?
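To make the container point concrete, here is a minimal sketch using Python's standard `zlib` and `zipfile` modules (not Botan's CLI; the payload and member name are made up for illustration). It builds an RFC 1950 zlib stream and a real .zip archive from the same bytes and checks which of the two a zip reader accepts.

```python
import io
import zipfile
import zlib

# Hypothetical payload, only for illustration.
payload = b"example payload " * 4

# RFC 1950 zlib stream: 2-byte header + deflate data + Adler-32 trailer.
zlib_stream = zlib.compress(payload)

# A real .zip archive holding one deflate-compressed member.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("payload.txt", payload)
zip_bytes = zip_buffer.getvalue()

# A zip reader looks for the archive's directory records, which a bare
# zlib stream does not have.
is_zlib_a_zip = zipfile.is_zipfile(io.BytesIO(zlib_stream))
is_zip_a_zip = zipfile.is_zipfile(io.BytesIO(zip_bytes))

print("zlib stream starts with:", zlib_stream[:2].hex())
print("zip archive starts with:", zip_bytes[:4])
print("zlib stream accepted as zip:", is_zlib_a_zip)
print("zip archive accepted as zip:", is_zip_a_zip)
```

Both outputs carry deflate-compressed data, but only the second has the zip container structure, so naming the first one `.zip` would mislead any tool that trusts the extension.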

@webmaster128
Collaborator Author

You're right. I did not put a lot of research into that one because I thought it was a simple typo. I have never seen a .zlib file.

.zip files are higher level containers, so definitely not correct here.

The .ZIP File Format Specification documents the following compression methods: Store (no compression), Shrink, Reduce (levels 1-4), Implode, Deflate, Deflate64, bzip2, LZMA (EFS), WavPack, and PPMd. The most commonly used compression method is DEFLATE, which is described in IETF RFC 1951. (Wikipedia)

But what is the difference between zlib and gzip? Isn't the first an implementation of the second?

@randombit
Owner

It is kind of confusing really... there is zlib the library which implements deflate compression (RFC 1951). On top of deflate compression several different incompatible compression formats are defined including gzip (RFC 1952) and zlib (RFC 1950), and also IIRC .zip. Early versions of zlib (the library) only produced zlib format, which is deflate + simple header + Adler32 checksum (or I think it could do raw deflate with no headers using a special mode; more recent versions have this anyway). It is only somewhat recently (zlib 1.2.0 and higher) that zlib (the library) can handle gzip (the format). Old versions of gzip (the command line program) would call zlib (the library) to produce raw deflate bits and added the gzip headers and checksum (a CRC instead of Adler32) itself.

Because for such a long time zlib (the library) could only easily produce zlib (the format) and not gzip (the format), it is common for zlib format to show up in binary protocols even though command line support for it is minimal to non-existent. For example the only compression algorithm ever really defined for TLS was zlib, XMPP compression has RFC 1950 zlib as mandatory to implement, etc.
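The three framings described above can be compared side by side. The following is a rough sketch using Python's `zlib` module rather than Botan; it relies on the zlib library's `wbits` convention (negative for raw deflate, 15 for zlib format, 31 for gzip format), and the helper name `deflate` and the sample data are just illustrative.

```python
import struct
import zlib


def deflate(data: bytes, wbits: int) -> bytes:
    """Compress data with the framing selected by wbits."""
    compressor = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=wbits)
    return compressor.compress(data) + compressor.flush()


data = b"hello hello hello hello\n"

raw = deflate(data, -15)          # RFC 1951: bare deflate, no header or checksum
zlib_stream = deflate(data, 15)   # RFC 1950: 2-byte header + deflate + Adler-32
gzip_stream = deflate(data, 31)   # RFC 1952: gzip header + deflate + CRC-32 + length

# zlib format ends with a big-endian Adler-32 of the uncompressed data.
(adler,) = struct.unpack(">I", zlib_stream[-4:])

# gzip format ends with a little-endian CRC-32 and the uncompressed length.
crc, size = struct.unpack("<II", gzip_stream[-8:])

print("zlib header:", zlib_stream[:2].hex(), "adler32 matches:", adler == zlib.adler32(data))
print("gzip header:", gzip_stream[:3].hex(), "crc32 matches:", crc == zlib.crc32(data))

# A gzip-only decoder rejects the zlib framing, even though both wrap deflate.
try:
    zlib.decompress(zlib_stream, wbits=31)
    gzip_decoder_accepts_zlib = True
except zlib.error:
    gzip_decoder_accepts_zlib = False
print("gzip decoder accepts zlib stream:", gzip_decoder_accepts_zlib)
```

The payload between header and trailer is ordinary deflate data in both cases; the formats differ only in the wrapper and the checksum, which is why they are incompatible on the wire despite sharing the algorithm.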

webmaster128 pushed a commit to webmaster128/botan that referenced this pull request Jan 2, 2016
@webmaster128
Collaborator Author

Thanks for the great explanation, which helped a lot to make things clearer. I tried to put that into comments.

This leads to another question: What about lzma and xz? As far as my research goes, .xz requires LZMA2 data:

LZMA2 is an extension on top of the original LZMA. LZMA2 uses
LZMA internally, but adds support for flushing the encoder,
uncompressed chunks, eases stateful decoder implementations,
and improves support for multithreading. Thus, the plain LZMA
will not be supported in this file format.

from the .xz file format documentation at http://tukaani.org/xz/xz-file-format.txt. This matches what Wikipedia says:

The .[xz] format, which can contain LZMA2 data, is documented at tukaani.org, while the .7z file format, which can contain either LZMA or LZMA2 data, is documented in the 7zformat.txt file contained in the LZMA SDK.

Assuming "lzma" runs LZMA_Compression, which is not LZMA2, this leads to invalid files.

webmaster128 pushed a commit to webmaster128/dummy-github-ref-test that referenced this pull request Jan 3, 2016
@randombit
Owner

At least on my system /usr/include/lzma/ and liblzma come from the xz package, and what liblzma produces seems to be accepted by the xz command:

$ echo foo > foo
$ ./botan compress --type=lzma foo
$ file foo.xz
foo.xz: XZ compressed data
$ xz -d foo.xz -c
foo

Based on the comments from lzma/container.h, XZ is basically identical to LZMA2 (or is a file format built around LZMA2, shades of gzip/zlib formats built on deflate I guess). It appears LZMA1 can be done using the lzma_alone_encoder whose use is actively discouraged by the xz authors (and which isn't used by the LZMA wrapper in the library).
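The same distinction can be checked outside of Botan. Below is a small sketch using Python's `lzma` module, which also wraps liblzma; as far as I know its `FORMAT_XZ` produces the .xz container and `FORMAT_ALONE` produces the legacy LZMA1 ".lzma" format, but treat that mapping to Botan's wrapper as an assumption. The sample data mirrors the `foo` file from the shell session.

```python
import lzma

data = b"foo\n"

# .xz container (what the xz command line tool reads and writes).
xz_stream = lzma.compress(data, format=lzma.FORMAT_XZ)

# Legacy "alone" container carrying LZMA1 data.
alone_stream = lzma.compress(data, format=lzma.FORMAT_ALONE)

# Magic bytes from the .xz file format specification.
XZ_MAGIC = b"\xfd7zXZ\x00"

print("xz stream has XZ magic:", xz_stream.startswith(XZ_MAGIC))
print("alone stream has XZ magic:", alone_stream.startswith(XZ_MAGIC))

# A strict .xz decoder should refuse the legacy container.
try:
    lzma.decompress(alone_stream, format=lzma.FORMAT_XZ)
    xz_decoder_accepts_alone = True
except lzma.LZMAError:
    xz_decoder_accepts_alone = False
print("xz decoder accepts alone stream:", xz_decoder_accepts_alone)
```

If the library's LZMA wrapper goes through the .xz encoder, as the `file` output above suggests, then the `.xz` extension is the right one; only output from the "alone" encoder would be mislabeled by it.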

0xa5a5 pushed a commit to 0xa5a5/botan that referenced this pull request Nov 8, 2019
Issue randombit#386: Use RDRAND in OpenSSL if that engine is available.
2 participants