Releases: zlib-ng/zlib-ng

2.2.1

02 Jul 14:02

This is the first stable release of the 2.2.x branch.
Please read the changelog for the 2.2.0 Release Candidate if you haven't already, especially if your software gives zlib-ng a custom allocator.

No bug reports came in during 2.2.0 RC testing, so the only change in 2.2.1 is a small fix for Configure that was already in the pipeline:

  • Configure: Don't use zlib-ng's -Wl,--version-script in tests #1750

2.2.0 Release Candidate

19 Jun 13:56
Pre-release

This release contains several larger changes and optimizations. On x86-64, for example, this yields a compression speedup of ~12% at the default level.

We also have a major reorganization of memory alloc/free so that allocation always happens during init. This allows applications to run the init early and be finished with the malloc system calls before they need to process latency-sensitive compression/decompression. It also ensures that zlib-ng cannot fail due to memory pressure after the init functions have run successfully. Deflate and inflate now each do only a single memory allocation, so we make fewer system calls and the allocated buffers live close together in memory.
Compression or decompression of very small buffers will now also be faster due to spending less time doing malloc/free.

The downside to this is that decompression will now always allocate the maximum required memory (~42KB total on 64-bit platforms); previously it would allocate (and potentially free) memory as needed during decompression.
It also means that applications that replace the alloc/free functions with their own can potentially have some issues (yes, I am looking at you, Nginx).
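
As a rough illustration of the "init early" pattern described above, the sketch below uses Python's standard `zlib` module rather than zlib-ng's C API. In CPython, constructing a `compressobj` or `decompressobj` runs the underlying library's init routine, so the objects can be created during setup and reused later. Whether that underlying library is zlib-ng 2.2.x depends on how Python was built, and the helper names `compress_message` and `decompress_message` are hypothetical.

```python
import zlib

# Setup phase: constructing these objects runs the library's init routines,
# which is where zlib-ng 2.2.x is described as doing its allocation.
compressor = zlib.compressobj(level=6)
decompressor = zlib.decompressobj()


def compress_message(data: bytes) -> bytes:
    """Compress one message and flush it, keeping the stream open."""
    return compressor.compress(data) + compressor.flush(zlib.Z_SYNC_FLUSH)


def decompress_message(chunk: bytes) -> bytes:
    """Decompress one flushed chunk produced by compress_message()."""
    return decompressor.decompress(chunk)


# Latency-sensitive phase: only the pre-initialized streams are used here.
payload = b"latency sensitive payload " * 8
wire = compress_message(payload)
restored = decompress_message(wire)

second = compress_message(b"another message")
restored_second = decompress_message(second)
```

The sketch only shows where the init cost is paid; it does not measure allocation behavior, which would need instrumentation of the C library itself.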

Changes

Buildsystem

  • Generate CMake package configuration files #1647
  • Relocate CMake target export definitions #1657
  • Allow overwrite NATIVEFLAG value. #1662 #1684
  • Fix xsave intrinsic test for clang, and gcc 8.2 or newer, and icc #1664
  • Disable Intel Compiler diagnostic message 10441 #1666
  • Add missing checks for 64bit arm/intel with msvc compiler #1667
  • Don't export git/github-related files in tar/zip archives #1688
  • Cleanup and update NMake Makefiles #1673
  • Add more result variables to the cmake package configuration #1671
  • Fix building with NVHPC #1698
  • CMake: Replace ; by $<SEMICOLON> in generator-expression #1707
  • Bump max CMake policy version to 3.29.0 #1709
  • make darwin cross compilation possible #1714

CI/Test

  • Improve code coverage handling #1640 #1642 #1675 #1729
  • Add VPCLMULQDQ crc32 tests to Google benchmarks #1651
  • Add small compress() benchmark #1721 #1730
  • Add back-and-forth inflateCopy() test #1731
  • Enable orphaned unit tests for compare256_rle family of functions #1739
  • Fix MSAN error in test_dict #1726
  • CI workflows
    • Add dependabot for github actions #1687
    • Upgrade ilammy/msvc-dev-cmd to v1.13.0 #1665
    • Upgrade codecov/codecov-action to v4. #1676
    • Upgrade github/codeql-action from 2 to 3 #1691
    • Upgrade actions/upload-artifact from 3 to 4 #1692
    • Upgrade mymindstorm/setup-emsdk to v14. #1677
    • Update dependencies for 32-bit MinGW CI run #1711
    • Use windows-2019 for build with toolset v141 #1725
    • Fix macOS Github Actions #1720

Cleanup

  • Removing some outdated comments #1655
  • Remove obsolete TARGET_OS_MAC check #1703

Refactoring and Optimizations

  • Move C fallback functions into arch/generic [Part 1] #1630 #1631 #1658 #1668
  • Remove unneeded pointer for functable.longest_match in deflate_slow #1633
  • Improve x86 intrinsics dependency handling #1643
  • Split cpu_features.h by architecture #1644
  • Speed up crc32_[v]pclmulqdq on small strings. #1650
  • Cleanup of bi_flush() #1660
  • Split cpu features and arch functions #1685 #1696
  • Inline CHUNKCOPY and CHUNKUNROLL #1669
  • Remove update_hash and insert_string implementations from functable #1681
  • Disable dynamic function dispatching for native or arch-specific builds #1659 #1701
  • Clean up insert_match() in deflate_medium #1682
  • Prefer HAVE_ALIGNED_ALLOC when available in zng_alloc #1635
  • Rewrite deflate memory allocation #1713 #1736

ARM

  • Add test for checking if -march=native needs -mfpu=neon for 32-bit ARM. #1683
  • Override Clang x4 NEON intrinsics for Android #1694
  • Add AArch64 feature detection support for OpenBSD #1732
  • Improved Configure ACLE check #1727

Power

  • Fix regression in Power8/9 detection #1649

RVV

  • Optimized rvv slide_hash #1704
  • arch/riscv/riscv_features.c: fix uclibc build #1700
  • Disable CodeCov for RISC-V as the toolchain doesn't support generating code coverage #1679

S390x

  • Update s390x dockerfile #1716
  • IBM zSystems DFLTCC: Extend sanitizer checks #1717
  • IBM zSystems DFLTCC: Inline DFLTCC states into zlib states #1718
  • Remove unused function dfltcc_alloc_state #1728

x86

  • Fix PCLMULQDQ, AVX512VNNI and VPCLMULQDQ feature tests for Intel LLVM compiler (icx) #1672
  • Fix invalid instruction usage in Xeon Phi x200 processors #1723

Misc

  • Sync changes with zlib 1.3.1 #1654
  • Fix deflate_state alignment with MS or clang-cl compilers #1663
  • Improve Z_NULL compatibility with zlib #1736
  • .gitattributes: Enforce LF line-endings on all non-binary files #1715

2.1.7

19 Jun 13:34

Due to the large number of refactoring changes in develop, I have decided to target those at a new version branch, 2.2.x.
There are also a lot of fixes and minor improvements, so those are backported and released as 2.1.7.

To work around issue #1708 (incompatibility with applications misusing the zlib zalloc/zfree API), #1710 is merged instead of backporting the much bigger #1713.

Backported Changes

Buildsystem

  • Generate CMake package configuration files #1647
  • Relocate CMake target export definitions #1657
  • Fix xsave intrinsic test for clang, and gcc 8.2 or newer, and icc #1664
  • Disable Intel Compiler diagnostic message 10441 #1666
  • Add missing checks for 64bit arm/intel with msvc compiler #1667
  • Don't export git/github-related files in tar/zip archives #1688
  • Add more result variables to the cmake package configuration #1671
  • Fix building with NVHPC #1698
  • CMake: Replace ; by $<SEMICOLON> in generator-expression #1707
  • Bump max CMake policy version to 3.29.0 #1709
  • make darwin cross compilation possible #1714

CI/Test

  • Improve code coverage handling #1640 #1642 #1675 #1729
  • Add VPCLMULQDQ crc32 tests to Google benchmarks #1651
  • Add small compress() benchmark #1721 #1730
  • Add back-and-forth inflateCopy() test #1731
  • Enable orphaned unit tests for compare256_rle family of functions #1739
  • Fix MSAN error in test_dict #1726
  • CI workflows
    • Add dependabot for github actions #1687
    • Upgrade ilammy/msvc-dev-cmd to v1.13.0 #1665
    • Upgrade codecov/codecov-action to v4. #1676
    • Upgrade github/codeql-action from 2 to 3 #1691
    • Upgrade actions/upload-artifact from 3 to 4 #1692
    • Upgrade mymindstorm/setup-emsdk to v14. #1677
    • Update dependencies for 32-bit MinGW CI run #1711
    • Use windows-2019 for build with toolset v141 #1725
    • Fix macOS Github Actions #1720

Cleanup

  • Removing some outdated comments #1655
  • Remove obsolete TARGET_OS_MAC check #1703

Refactoring and Optimizations

  • Remove unneeded pointer for functable.longest_match in deflate_slow #1633

ARM

  • Add test for checking if -march=native needs -mfpu=neon for 32-bit ARM. #1683
  • Override Clang x4 NEON intrinsics for Android #1694
  • Add AArch64 feature detection support for OpenBSD #1732

Power

  • Fix regression in Power8/9 detection #1649

RVV

  • arch/riscv/riscv_features.c: fix uclibc build #1700
  • Disable CodeCov for RISC-V as the toolchain doesn't support generating code coverage #1679

S390x

  • IBM zSystems DFLTCC: Extend sanitizer checks #1717
  • Update s390x dockerfile #1716

x86

  • Fix PCLMULQDQ, AVX512VNNI and VPCLMULQDQ feature tests for Intel LLVM compiler (icx) #1672
  • Fix invalid instruction usage in Xeon Phi x200 processors #1723

Misc

  • free_aligned: validate passed in pointer #1710
  • Sync changes with zlib 1.3.1 (LIT_MEM changes not included) #1654
  • Improve Z_NULL compatibility with zlib #1736
  • .gitattributes: Enforce LF line-endings on all non-binary files #1715

2.1.6

10 Jan 22:35

This is a stable release, with several minor improvements and one corruption fix for inflateCopy().
This release also improves the functable implementation and moves its initialization into deflateInit() and inflateInit(). We also have some optimizations for RVV and ARM.

Notes for packagers:

  • The FAR macro has been added back to zlib-compat mode in this release; please remember to remove downstream patches that add it.
  • Please consider removing CMake INSTALL_LIB_DIR workarounds. They should not have been needed since v2.0.2 (2021), but packagers seem to keep copying the workaround from each other. Please see cmake/detect-install-dirs.cmake.

Changes

  • Fix inflateCopy corruption caused by change in 2.1.4 #1628
    • This is a regression caused by a change introduced in 2.1.4

  • Functable

    • Initialize functable without TLS, using atomics #1609
    • Initialize functable early, during DeflateInit and InflateInit #1613
  • API

    • Add FAR macro to zlib-compat headers to improve compatibility #1637
  • ARM

    • Improve performance of crc32_acle on 32-bit ARM #1397
    • Add support for __attribute__((target(...))) to overcome limitations of -march=native #1620
    • Remove tab character in ACLE uqsub16 assembly #1627
  • RVV

    • Optimize adler32_fold_copy using RVV #1597
  • x86

    • Simplify AVX2 and AVX512 adler32_fold_copy by removing templates #1599
  • Buildsystem

    • Don't attempt ARMv6 detection on AARCH64 #1617
    • Prevent tests writing into source directory #1604
    • CMake: Fix clang-cl warnings #1591
    • CMake: Export cmake target #1601 #1611
    • CMake: Remove duplicate enable tests option #1610
    • CMake: Fix reading version information from zlib.h.in #1614
    • CMake: Check whether compiler supports -march=native or -mcpu=native #1618
    • CMake: Always run compiler feature tests without LTO #1622
    • CMake: Make sure uqsub16 check doesn't get optimized away with LTO #1619
    • CMake: Update to GoogleTest 1.12.1 #1623
      • Don't disable GoogleTest because of old CMake version #1623 #1638
    • CI: Add linter workflow for whitespace errors #1625 #1632
    • CI: Cancel outdated running CI jobs for PR or branch #1629
    • CI: Added CI instance for WITH_NATIVE_INSTRUCTIONS #1634
    • Tests: Fix buffer overflow in compare256_rle benchmark #1612
  • Misc

    • Update copyright to sync with zlib 1.3 #1615
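
The inflateCopy() fix listed above concerns duplicating a decompression stream partway through. A minimal way to exercise that code path, assuming CPython's standard `zlib` module (whose `decompressobj.copy()` is built on inflateCopy()), is sketched below. It is not the project's regression test and is not guaranteed to trigger the specific corruption fixed in #1628. It only checks that a copied stream and its original produce identical output.

```python
import zlib

# Build a compressed blob large enough to be split mid-stream.
data = bytes(range(256)) * 64
blob = zlib.compress(data)
half = len(blob) // 2

# Decompress the first half, then duplicate the stream state.
original = zlib.decompressobj()
first_part = original.decompress(blob[:half])
clone = original.copy()

# Feed the remaining input to both streams independently.
tail_original = original.decompress(blob[half:]) + original.flush()
tail_clone = clone.decompress(blob[half:]) + clone.flush()

# Both streams should finish with the same bytes.
streams_agree = tail_clone == tail_original
full_output = first_part + tail_original
```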

2.1.5

27 Nov 12:31

This is a hotfix release, fixing an issue where certain applications would fail with a checksum error during inflate (decompression).
A few minor fixes and improvements are also included.

  • Fix bug with Z_FINISH handling with no window. #1602

    • This was detected by libgit2 unit tests (issue #1600)
  • Added unit test for inflate with Z_FINISH and no window #1603

  • Fix CMake handling of CMAKE_INSTALL_INCLUDEDIR #1593

  • Fix pkgconfig support for WITH_GZFILEOP #1595 #1598

  • Github Actions update #1590

  • Readme Update #1594

2.1.4

19 Oct 16:59

This is a stable release, with several minor improvements and one fix for a possible buffer overrun while using inflateCopy().
Zlib-ng's zlib-compat mode is now targeting zlib 1.3 compatibility.
Of note, we have new optimizations for ARM and RISC-V RVV, and a lot of fixes and improvements to the buildsystem.

  • Fix: inflateCopy() allocate window with padding #1583

  • Pull zlib 1.3 changes #1563

  • API

    • Deprecate ZLIBNG_VER_STATUS, use ZLIBNG_VER_STATUSH #1581
  • MacOS

    • Relocatable pkg config files, @rpath/ install name on macOS #1546
  • MinGW

    • MinGW32 build fixes #1499
    • Support llvm-mingw toolchain #1570
  • ARM

    • Optimize slide_hash for ARMv6 #1538
    • Handle ARM64EC #1539
    • Remove inert check for HAVE_ACLE_FLAG in check_acle_compiler_flag #1554
    • Clean up ARM detection and allow ACLE on all ARM archs #1567
  • Loongarch

    • Initial loongarch port #1537
  • PowerPC

    • Fix building benchmarks on 32-bit PowerPC #1588
  • RVV

    • Optimize adler32 using rvv #1532
    • Optimize chunkset #1568
    • Support RVV hwcap detect at runtime #1585
  • x86

    • Move the AVX compatibility functions into a separate file #1540
    • Clean up SSE4.2 support, fixes compile issues under docker/VM #1542
  • Buildsystem

    • Improve intrinsics detection: ensure intrinsics are not optimized out #1562
    • CMake: Fix cross-compiling benchmarks and libpng #1589
    • CMake: Fix examining value of GENERATOR_IS_MULTI_CONFIG #1575
    • CMake: Fix Match CMAKE_GENERATOR_TOOLSET #1577
    • CMake: Cleanup handling of march=native #1544 #1578
    • CMake: Add CPack capability #1556 #1579
    • Configure: Remove march=native support #1555
    • Configure: use dash not bash #1561
    • Configure: Fix disabling deflate_quick and deflate_medium #1545
    • Configure: Fix distclean #1530
  • Misc

    • Added symbol versioning support for having multiple versions of functions #1396
    • Simplify deflate stream/state check #1502
    • Clean up __msan_unpoison() usage #1534
    • Fix spellings, typos and etc #1549 #1550 #1551 #1552 #1553

Thanks to all the contributors, this release looks to be the best and most stable one so far. 🎉

2.1.3

29 Jun 08:28

This is a stable release, with several minor improvements and one fix for a possible endless loop during inflate.
The endless loop bug was detected using unpigz, and is likely a rare corner case that was exposed by pigz threading.
We also have optimizations for the upcoming RISC-V RVV instruction set.

Changes since 2.1.2:

  • Fix endless loop bug in chunkcopy_safe. #1526

  • Support using distro-supplied Gtest #1519

  • Minor code cleanup of deflate.c #1500

  • ARM

    • Improve buildsystem detection of ARM Cortex #1521
  • MacOS

    • Add support for RPath #1524
    • Add check for features.h #1527
  • PowerPC

    • Cross-compiling and little-endian fixes #1518 #1520
  • RISC-V

    • Optimize compare256 using RVV #1498
    • Optimize slide_hash using RVV #1522

2.1.2

07 Jun 19:30

This is the first stable release of the 2.1.x branch.
The changes since beta2 are minor: no changes to code, only to the buildsystem and tests.

Changes since 2.1.0-Beta2:

  • Stop using COMMAND_ECHO in ctests, it is not supported in CMake versions older than 3.15.
  • Add MIPS/MIPS64 CI tests.
  • Fix make distclean command with configure/Makefile.
  • Fix using configure/Makefile on architecture without a directory under arch.

Full release notes for the first 2.1 stable release:

This release contains two years of development and improvements to zlib-ng, as well as fixes and changes inherited from zlib.

The 2.1.x version series has new targeted minimum buildsystem versions, as detailed on the Wiki: https://github.com/zlib-ng/zlib-ng/wiki

Buildsystem:

  • Many improvements to the CMake scripts.
  • Improved support for detecting memory alignment functions.
  • Improved support for unaligned access by letting the compiler promote code to unaligned if supported by the CPU.
  • Remove x86 cpu feature detection for TZCNT, safely fall back to BSF.
  • Enable using AVX512 intrinsics with GCC <9.

Optimizations and Enhancements:

  • Decompression is a lot faster (56% faster measured on AVX2-capable x86-64)
  • Compression is improved for Level 9, at the cost of a little performance.
  • Compression is improved for Level 3, by switching from deflate_fast to deflate_medium.
  • Levels 3 and 4 have been reconfigured to provide a better gradual tradeoff for speed/compression between levels 2 and 5.
  • Deflate_quick (Level 1) has been improved to default to a bigger windowsize and support changing the window size like the other levels.
  • Deflate_rle has been optimized with its own compare_256 implementation.
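
To see how the level tradeoffs described above play out on a particular workload, a small measurement loop can help. The sketch below uses Python's standard `zlib` module and a synthetic sample, both chosen purely for illustration. The `profile_levels` helper is hypothetical, the numbers depend on which zlib implementation Python is linked against, and this is not the project's own benchmark (which uses Gbench).

```python
import time
import zlib


def profile_levels(data: bytes, levels=range(1, 10)) -> dict:
    """Return {level: (compressed_size, seconds)} for each requested level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        # Every level must round-trip losslessly.
        assert zlib.decompress(packed) == data
        results[level] = (len(packed), elapsed)
    return results


# Synthetic, log-like sample data; real workloads will behave differently.
sample = b"".join(
    b"record %d: status=ok value=%d\n" % (i, i * 7 % 97) for i in range(20000)
)

report = profile_levels(sample)
for level, (size, seconds) in report.items():
    print(f"level {level}: {size} bytes, {seconds * 1000:.2f} ms")
```

Single timings like these are noisy; repeating the loop and taking the minimum per level gives a steadier picture.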

New instruction set optimizations:

  • Adler32 implementation using AVX512, AVX512-VNNI, VMX.
  • CRC32-B implementation using VPCLMULQDQ & IBM-Z.
  • Slide hash implementation using VMX.
  • Compare256 implementations using SSE2, Neon, & POWER9.
  • Inflate chunk copying using SSSE3 & VSX.

Compatibility and Porting:

  • CRC-32 computation changes from madler/zlib. zlib-ng/zlib-ng#a6155234
  • Compatible and up-to-date with zlib 1.2.13.
  • Removed the usage of macros in zlib-ng.h, making life easier for languages that want to call the C functions without having the C preprocessor (Python, etc).

Improved support for more environments:

  • Apple M1
  • vcpkg
  • Emscripten

Testing:

  • Tests have been converted to use GTest. Many new tests have also been added.
  • Gbench support has been added to easily benchmark changes to performance-critical functions.

Misc:

  • Several pieces of core code have been restructured or rewritten.
  • Too many changes to list here; see the git commit log for the full list of changes.

Deprecations:

  • Configure no longer has the full range of tests.
  • NMake is no longer actively supported and tested; it is now community-supported.
  • See the wiki for minimum build system versions and deprecations: https://github.com/zlib-ng/zlib-ng/wiki

2.1.1-beta2

17 May 11:12
Pre-release

This is the second beta of the 2.1.x branch.
The changes since beta1 are relatively minor: mostly buildsystem fixes and improved testing, plus one minor fix for zlib-compat mode.
This release also has two new optimizations: a good improvement for deflate_rle and a micro-optimization for AVX512 CRC32.

Changes since 2.1.0-Beta1:

  • Fix missing exported z_size_t type in zlib.h (zlib-compat mode).
  • Fix two Coverity warnings
  • Fix CMake GNUInstallDirs usage
  • Improved AVX512-VNNI compiler feature detection, for compilers with early AVX512-VNNI support (GCC 8.0 etc.)
  • Micro-optimization for AVX512 implementation of CRC32
  • Optimized deflate_rle compression, also added related test and benchmark.
  • Add testing of file_compress/file_uncompress in minigzip/minideflate
  • Add deflate_fast to switchlevels test
  • Add emulated RISC-V to CI test workflow
  • Fix abicheck CI test not ignoring version string
  • Fix MinGW CI test, broken by Github Actions VM image updates

2.1.0-beta1

28 Apr 09:45
Pre-release

This release contains two years of development and improvements to zlib-ng, as well as fixes and changes inherited from zlib.

The 2.1.x version series has new targeted minimum buildsystem versions, as detailed on the Wiki: https://github.com/zlib-ng/zlib-ng/wiki

Buildsystem:

  • Many improvements to the CMake scripts.
  • Improved support for detecting memory alignment functions.
  • Improved support for unaligned access by letting the compiler promote code to unaligned if supported by the CPU.
  • Remove x86 cpu feature detection for TZCNT, safely fall back to BSF.
  • Enable using AVX512 intrinsics with GCC <9.

Optimizations and Enhancements:

  • Decompression is a lot faster (56% faster measured on AVX2-capable x86-64)
  • Compression is improved for Level 9, at the cost of a little performance.
  • Compression is improved for Level 3, by switching from deflate_fast to deflate_medium.
  • Levels 3 and 4 have been reconfigured to provide a better gradual tradeoff for speed/compression between levels 2 and 5.
  • Deflate_quick (Level 1) has been improved to default to a bigger windowsize and support changing the window size like the other levels.

New instruction set optimizations:

  • Adler32 implementation using AVX512, AVX512-VNNI, VMX.
  • CRC32-B implementation using VPCLMULQDQ & IBM-Z.
  • Slide hash implementation using VMX.
  • Compare256 implementations using SSE2, Neon, & POWER9.
  • Inflate chunk copying using SSSE3 & VSX.

Compatibility and Porting:

  • CRC-32 computation changes from madler/zlib. zlib-ng/zlib-ng#a6155234
  • Compatible and up-to-date with zlib 1.2.13.
  • Removed the usage of macros in zlib-ng.h, making life easier for languages that want to call the C functions without having the C preprocessor (Python, etc).

Improved support for more environments:

  • Apple M1
  • vcpkg
  • Emscripten

Testing:

  • Tests have been converted to use GTest. Many new tests have also been added.
  • Gbench support has been added to easily benchmark changes to performance-critical functions.

Misc:

  • Several pieces of core code have been restructured or rewritten.
  • Too many changes to list here; see the git commit log for the full list of changes.

Deprecations:

  • Configure no longer has the full range of tests.
  • NMake is no longer actively supported and tested; it is now community-supported.
  • See the wiki for minimum build system versions and deprecations: https://github.com/zlib-ng/zlib-ng/wiki