Possibly integrate zlib-ng #416

Open
gdevenyi opened this Issue Jan 16, 2019 · 4 comments

gdevenyi commented Jan 16, 2019

Related to #348: zlib-ng is a project attempting to modernize the zlib codebase:
https://github.com/zlib-ng/zlib-ng

thewtex commented Jan 16, 2019

Also pigz (a parallel implementation of gzip):

https://zlib.net/pigz/

hjmjohnson commented Jan 16, 2019

I've heard good things about pigz (Chris Rorden is a big fan). I'm initially a bit concerned that the source code is only available as a tarball from 2 years ago.

It also seems likely to work well in POSIX environments, but it is not clear to me how well it is supported on other platforms.

neurolabusc commented Jan 20, 2019

You might find my comparison of different GZ compression strategies relevant. For that test I intentionally used a slow hard disk. With zlib you can write a compressed file directly to disk. Traditionally, with pigz we need to write the raw data to disk first, and then pigz loads this file and compresses it. So one has to wonder whether the parallel speedup of pigz is offset by the extra disk IO. The link also shows that if you have a modern version of pigz on Unix, you can use a named pipe to avoid writing the whole uncompressed file to a slow disk. In my experience, pigz works fine on Windows, though I don't think you can use the named-pipe trick there.

The fact that pigz has not changed much in recent years may just be a sign that it is mature. The original deflate/gz format dates back to a time when memory was limited and multiple cores were exotic. It has achieved widespread use, but real innovation will come from new standards that leverage modern computers. zstd is extremely impressive, in particular for medical datasets when it is combined with a byte-shuffling filter.
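To make the byte-shuffling idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the helper names `byte_shuffle` and `byte_unshuffle` are made up for this example, the input is synthetic 16-bit data rather than a real medical image, and the standard-library `zlib` module stands in for zstd so the snippet has no third-party dependencies. The filter regroups the bytes so that all low-order bytes sit together, followed by all high-order bytes, which often gives the compressor longer runs of similar values.

```python
import struct
import zlib


def byte_shuffle(data: bytes, itemsize: int) -> bytes:
    """Regroup bytes: byte 0 of every element first, then byte 1, and so on."""
    if len(data) % itemsize:
        raise ValueError("data length must be a multiple of itemsize")
    return b"".join(data[i::itemsize] for i in range(itemsize))


def byte_unshuffle(data: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle, restoring the original element layout."""
    if len(data) % itemsize:
        raise ValueError("data length must be a multiple of itemsize")
    count = len(data) // itemsize
    out = bytearray(len(data))
    for i in range(itemsize):
        # Plane i holds byte i of every element; scatter it back in place.
        out[i::itemsize] = data[i * count:(i + 1) * count]
    return bytes(out)


if __name__ == "__main__":
    # Synthetic little-endian 16-bit samples (an assumption, not real image data).
    values = [1000 + (i * 7) % 300 for i in range(20000)]
    raw = struct.pack(f"<{len(values)}H", *values)

    shuffled = byte_shuffle(raw, 2)
    assert byte_unshuffle(shuffled, 2) == raw  # the filter is lossless

    # zlib is a stand-in here; the comment above is about zstd.
    print("plain   :", len(zlib.compress(raw)))
    print("shuffled:", len(zlib.compress(shuffled)))
```

Whether the shuffled stream ends up smaller depends on the data, so it is worth measuring on real volumes rather than trusting this toy input.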

In my testing, I always found Cloudflare's zlib faster than zlib-ng. However, this may have changed or may have been due to compiler settings. Cloudflare seems stuck at an older version of zlib (1.2.8).

gdevenyi commented Jan 21, 2019

Discussion regarding Cloudflare's version compared to zlib-ng: zlib-ng/zlib-ng#42
