-
-
Notifications
You must be signed in to change notification settings - Fork 257
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Question: what about the CloudFlare patches? #42
Comments
With some help from Shuxin Yang, the original author of the Cloudflare fork, I merged Intel's and Cloudflare's changes, so this means we get PCLMULQDQ support as well. The latest cloudflare changes are a bit less portable and less compartmentalized that we'd like, but we have been keeping an eye on the changes and we are experimenting with similar changes in private forks. Hopefully we'll get some of them merged pretty soon. And for the record, this fork would not exist had it not been for both Intel and Cloudflare spending time and money on contributing interesting ideas and code back to the community. But neither seem to be interested in maintaining it for cross-architecture portability and that is where zlib-ng is focused. We have nobody working on the Mac-side of things though, so any contributions would be welcome. We are in #zlib-ng on freenode if you want to chat some time. |
On Saturday June 06 2015 03:50:41 Hans Kristian Rosbach wrote:
Yes, at least one of the changes involves SSE4.2 specific code for which no fallback has been provided. I've tried to find one for the mm_crc32 intrinsic, but my gugle-foo has failed me to find a readily available code snippet.
I'll have a look, I didn't try to build the code yet. BTW, I saw you forked the zlib-bench repo: the original has been updated recently and now groups the benchmarks by compression level which is probably more useful. Cheers, |
* Replace GPL CRC with BSD CRC (zlib-ng/zlib-ng#42), for validation see https://github.com/neurolabusc/simd_crc * Westmere detection * Update configure for Westmere (InsightSoftwareConsortium/ITK#416) * use cpu_has_pclmul() to autodetect CPU hardware (InsightSoftwareConsortium/ITK#416) * remove gpl code * Improve support for compiling using Windows (https://github.com/ningfei/zlib) * Import ucm.cmake from https://github.com/ningfei/zlib * crc32_simd as separate file (#18) * atomic and SKIP_CPUID_CHECK (#18) * Remove unused code * Only allow compiler to use clmul instructions to crc_simd unit (intel/zlib#25) * Atomic does not compile on Ubuntu 14.04 * Allow "cmake -DSKIP_CPUID_CHECK=ON .." to not check SIMD CRC support on execution. * Restore Windows compilation using "nmake -f win32\Makefile.msc" * zconf.h required for Windows nmake * support crc intrinsics for ARM CPUs * Requested changes (#19)
I just did some benchmarking on the current versions of various deflate implementations, including zlib-ng. For decompression, fast (1) compression, and default (6) compression, zlib-ng wins handily. For maximum (9) compression, cloudflare-zlib had a substantial performance lead over zlib-ng. It looks like there may be further improvements from the cloudflare-zlib patches that could hopefully be integrated into zlib-ng to speed up the maximum compression level.
|
As far as I know all of the cloudflare patches that we were interested in have been merged. I think the only things that weren't merged were due to licensing (CRC SIMD). The cloudflare |
* Fix architecture macro on MSVC. Signed-off-by: Ningfei Li <ningfei.li@gmail.com> * Update '.gitignore'. Also remove 'zconf.h', it will be generated during building. * Add 'build' subdir to .gitignore * Fix architecture macro on MSVC. * Update CMakeLists.txt Clean up. Add option to set runtime (useful for MSVC). Add 'SSE4.2' and 'AVX' option. Signed-off-by: Ningfei Li <ningfei.li@gmail.com> * Define '_LARGEFILE64_SOURCE' by default in CMakeLists.txt. Also remove checking fseeko. Signed-off-by: Ningfei Li <ningfei.li@gmail.com> * Set 'CMAKE_MACOSX_RPATH' to TRUE. Signed-off-by: Ningfei Li <ningfei.li@gmail.com> * Update SSE4.2 and PCLMUL setting in CMakeLists.txt Signed-off-by: Ningfei Li <ningfei.li@gmail.com> * Add 'HAVE_HIDDEN' to CMakeLists.txt Signed-off-by: Ningfei Li <ningfei.li@gmail.com> * Fix building dll on MSVC. Signed-off-by: Ningfei Li <ningfei.li@gmail.com> * Fix zlib.pc file generation. Signed-off-by: Ningfei Li <ningfei.li@gmail.com> * Refine cmake settings. * Change cmake function to lowercase. * Fix zlib.pc * Fix crc32_pclmul_le_16 type. * Set POSITION_INDEPENDENT_CODE flag to ON by default. * Replace GPL CRC with BSD CRC (zlib-ng/zlib-ng#42), for validation see https://github.com/neurolabusc/simd_crc * Westmere detection * Update configure for Westmere (InsightSoftwareConsortium/ITK#416) * use cpu_has_pclmul() to autodetect CPU hardware (InsightSoftwareConsortium/ITK#416) * remove gpl code * Improve support for compiling using Windows (https://github.com/ningfei/zlib) * Import ucm.cmake from https://github.com/ningfei/zlib * crc32_simd as separate file (#18) * atomic and SKIP_CPUID_CHECK (#18) * Suppress some MSVC warnings. * Remove unused code * Fix ucm_set_runtime when only C is enabled for the project. * Removed configured header file. * Unify zconf.h template. * Only allow compiler to use clmul instructions to crc_simd unit (intel/zlib#25) * Atomic does not compile on Ubuntu 14.04 * Allow "cmake -DSKIP_CPUID_CHECK=ON .." to not check SIMD CRC support on execution. * Restore Windows compilation using "nmake -f win32\Makefile.msc" * Do not set SOVERSION on Cygwin. * Fix file permission. * Refine PCLMUL CMake option. * zconf.h required for Windows nmake * Do not set visibility flag on Cygwin. * Fix compiling dll resource. * Fix compiling using MinGW. SSE4.2 and PCLMUL are also supported. * Fix zconf.h for Windows. * support crc intrinsics for ARM CPUs * Add gzip -k option to minigzip (to aid benchmarks) * Support Intel and ARMv8 optimization in CMake. * Clean up. * Fix compiling on Apple M1. check_c_compiler_flag detects SSE2, SSE3, SSE42 and PCLMUL but compiling fails. Workaround fix, need further investigation. * Restore MSVC warnings. Co-authored-by: neurolabusc <rorden@sc.edu> Co-authored-by: Chris Rorden <rorden@mailbox.sc.edu>
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Hi,
I see you incorporated those of CloudFlare's changes that could be incorporated, and wonder if you could elaborate what that means.
I've just spent some time adapting a MacPorts "recipe" to build zlib with the CF performance patches that won't (= ought not) change the zlib license, which basically means I'm not applying the PCLMUL stuff which is GPL'ed (and doesn't build as is on OS X anyway).
There have been rather recent updates to the CloudFlare patches that give me over 2x gain in compressing speed in some cases, so that's much more interesting than what I've seen before.
Did you include those changes too (I'll admit I didn't check where this additional gain comes from exactly)?
Regards,
René
The text was updated successfully, but these errors were encountered: