Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question: what about the CloudFlare patches? #42

Closed
RJVB opened this issue Jun 6, 2015 · 4 comments
Closed

Question: what about the CloudFlare patches? #42

RJVB opened this issue Jun 6, 2015 · 4 comments

Comments

@RJVB
Copy link
Contributor

RJVB commented Jun 6, 2015

Hi,

I see you incorporated those of CloudFlare's changes that could be incorporated, and wonder if you could elaborate what that means.
I've just spent some time adapting a MacPorts "recipe" to build zlib with the CF performance patches that won't (= ought not) change the zlib license, which basically means I'm not applying the PCLMUL stuff which is GPL'ed (and doesn't build as is on OS X anyway).
There have been rather recent updates to the CloudFlare patches that give me over 2x gain in compressing speed in some cases, so that's much more interesting than what I've seen before.
Did you include those changes too (I'll admit I didn't check where this additional gain comes from exactly)?

Regards,
René

@Dead2
Copy link
Member

Dead2 commented Jun 6, 2015

With some help from Shuxin Yang, the original author of the Cloudflare fork, I merged Intel's and Cloudflare's changes, so this means we get PCLMULQDQ support as well.
Making this portable to other platforms has taken a lot of work.

The latest cloudflare changes are a bit less portable and less compartmentalized that we'd like, but we have been keeping an eye on the changes and we are experimenting with similar changes in private forks. Hopefully we'll get some of them merged pretty soon.

And for the record, this fork would not exist had it not been for both Intel and Cloudflare spending time and money on contributing interesting ideas and code back to the community. But neither seem to be interested in maintaining it for cross-architecture portability and that is where zlib-ng is focused. We have nobody working on the Mac-side of things though, so any contributions would be welcome. We are in #zlib-ng on freenode if you want to chat some time.

@RJVB
Copy link
Contributor Author

RJVB commented Jun 6, 2015

On Saturday June 06 2015 03:50:41 Hans Kristian Rosbach wrote:

The latest cloudflare changes are a bit less portable and less compartmentalized that we'd like, but we have been keeping an eye on the changes and we are experimenting with similar changes in private forks. Hopefully we'll get some of them merged pretty soon.

Yes, at least one of the changes involves SSE4.2 specific code for which no fallback has been provided. I've tried to find one for the mm_crc32 intrinsic, but my gugle-foo has failed me to find a readily available code snippet.

We have been nobody working on the Mac-side of things though, so any contributions would be welcome. We are in #zlib-ng on freenode if you want to chat some time.

I'll have a look, I didn't try to build the code yet.

BTW, I saw you forked the zlib-bench repo: the original has been updated recently and now groups the benchmarks by compression level which is probably more useful.

Cheers,
Ren�

@Dead2 Dead2 added the question label Mar 17, 2017
vkrasnov pushed a commit to cloudflare/zlib that referenced this issue Feb 26, 2020
* Replace GPL CRC with BSD CRC (zlib-ng/zlib-ng#42), for validation see https://github.com/neurolabusc/simd_crc

* Westmere detection

* Update configure for Westmere (InsightSoftwareConsortium/ITK#416)

* use cpu_has_pclmul() to autodetect CPU hardware (InsightSoftwareConsortium/ITK#416)

* remove gpl code

* Improve support for compiling using Windows (https://github.com/ningfei/zlib)

* Import ucm.cmake from https://github.com/ningfei/zlib

* crc32_simd as separate file (#18)

* atomic and SKIP_CPUID_CHECK (#18)

* Remove unused code

* Only allow compiler to use clmul instructions to crc_simd unit (intel/zlib#25)

* Atomic does not compile on Ubuntu 14.04

* Allow "cmake -DSKIP_CPUID_CHECK=ON .." to not check SIMD CRC support on execution.

* Restore Windows compilation using "nmake -f win32\Makefile.msc"

* zconf.h required for Windows nmake

* support crc intrinsics for ARM CPUs

* Requested changes (#19)
@joshtriplett
Copy link
Contributor

I just did some benchmarking on the current versions of various deflate implementations, including zlib-ng. For decompression, fast (1) compression, and default (6) compression, zlib-ng wins handily. For maximum (9) compression, cloudflare-zlib had a substantial performance lead over zlib-ng. It looks like there may be further improvements from the cloudflare-zlib patches that could hopefully be integrated into zlib-ng to speed up the maximum compression level.

Decompression (958832135-byte gzipped tarball of /usr/lib, raw deflate stream only)
miniz_oxide: 10.593881744s
zlib: 10.679192106s
cloudflare_zlib: 10.496180677s
zlib-ng: 8.825157959s

Compression (2717675520-byte tarball of /usr/lib)

Compression (fast)
miniz_oxide: 1151036261 bytes (42.4%) in 25.146s
zlib: 1055195230 bytes (38.8%) in 45.764s
cloudflare_zlib: 1060244345 (39.0%) bytes in 28.709s
zlib-ng: 1312065000 bytes (48.3%) in 15.782s

Compression (default)
miniz_oxide: 960329164 bytes (35.3%) in 133.801s
zlib: 958349769 bytes (35.3%) in 119.928s
cloudflare_zlib: 977624382 (36.0%) bytes in 61.052s
zlib-ng: 984248874 bytes (36.2%) in 56.822s

Compression (maximum)
miniz_oxide: 956746618 bytes (35.2%) in 279.983s
zlib: 952789700 bytes (35.1%) in 436.421s
cloudflare_zlib: 974464871 bytes (35.9%) in 231.681s
zlib-ng: 974464871 bytes (35.9%) in 276.225s

@nmoinvaz
Copy link
Member

nmoinvaz commented Oct 21, 2020

As far as I know all of the cloudflare patches that we were interested in have been merged. I think the only things that weren't merged were due to licensing (CRC SIMD). The cloudflare longest_match algorithm also has a minimum match size of 4 bytes which hurts compression on some types of data. It makes it faster, but does not provide as good compression for those data types. Can we close this issue?

vkrasnov pushed a commit to cloudflare/zlib that referenced this issue Dec 2, 2020
* Fix architecture macro on MSVC.

Signed-off-by: Ningfei Li <ningfei.li@gmail.com>

* Update '.gitignore'.

Also remove 'zconf.h', it will be generated during building.

* Add 'build' subdir to .gitignore

* Fix architecture macro on MSVC.

* Update CMakeLists.txt

Clean up.
Add option to set runtime (useful for MSVC).
Add 'SSE4.2' and 'AVX' option.

Signed-off-by: Ningfei Li <ningfei.li@gmail.com>

* Define '_LARGEFILE64_SOURCE' by default in CMakeLists.txt.

Also remove checking fseeko.

Signed-off-by: Ningfei Li <ningfei.li@gmail.com>

* Set 'CMAKE_MACOSX_RPATH' to TRUE.

Signed-off-by: Ningfei Li <ningfei.li@gmail.com>

* Update SSE4.2 and PCLMUL setting in CMakeLists.txt

Signed-off-by: Ningfei Li <ningfei.li@gmail.com>

* Add 'HAVE_HIDDEN' to CMakeLists.txt

Signed-off-by: Ningfei Li <ningfei.li@gmail.com>

* Fix building dll on MSVC.

Signed-off-by: Ningfei Li <ningfei.li@gmail.com>

* Fix zlib.pc file generation.

Signed-off-by: Ningfei Li <ningfei.li@gmail.com>

* Refine cmake settings.

* Change cmake function to lowercase.

* Fix zlib.pc

* Fix crc32_pclmul_le_16 type.

* Set POSITION_INDEPENDENT_CODE flag to ON by default.

* Replace GPL CRC with BSD CRC (zlib-ng/zlib-ng#42), for validation see https://github.com/neurolabusc/simd_crc

* Westmere detection

* Update configure for Westmere (InsightSoftwareConsortium/ITK#416)

* use cpu_has_pclmul() to autodetect CPU hardware (InsightSoftwareConsortium/ITK#416)

* remove gpl code

* Improve support for compiling using Windows (https://github.com/ningfei/zlib)

* Import ucm.cmake from https://github.com/ningfei/zlib

* crc32_simd as separate file (#18)

* atomic and SKIP_CPUID_CHECK (#18)

* Suppress some MSVC warnings.

* Remove unused code

* Fix ucm_set_runtime when only C is enabled for the project.

* Removed configured header file.

* Unify zconf.h template.

* Only allow compiler to use clmul instructions to crc_simd unit (intel/zlib#25)

* Atomic does not compile on Ubuntu 14.04

* Allow "cmake -DSKIP_CPUID_CHECK=ON .." to not check SIMD CRC support on execution.

* Restore Windows compilation using "nmake -f win32\Makefile.msc"

* Do not set SOVERSION on Cygwin.

* Fix file permission.

* Refine PCLMUL CMake option.

* zconf.h required for Windows nmake

* Do not set visibility flag on Cygwin.

* Fix compiling dll resource.

* Fix compiling using MinGW.

SSE4.2 and PCLMUL are also supported.

* Fix zconf.h for Windows.

* support crc intrinsics for ARM CPUs

* Add gzip -k option to minigzip (to aid benchmarks)

* Support Intel and ARMv8 optimization in CMake.

* Clean up.

* Fix compiling on Apple M1.

check_c_compiler_flag detects SSE2, SSE3, SSE42 and PCLMUL but compiling
fails. Workaround fix, need further investigation.

* Restore MSVC warnings.

Co-authored-by: neurolabusc <rorden@sc.edu>
Co-authored-by: Chris Rorden <rorden@mailbox.sc.edu>
@Dead2 Dead2 closed this as completed Feb 25, 2021
@zlib-ng zlib-ng locked and limited conversation to collaborators Feb 25, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Projects
None yet
Development

No branches or pull requests

4 participants