@Cyan4973 Cyan4973 released this Sep 11, 2018

This is a maintenance release, mainly triggered by issue #560.
#560 is a data corruption bug that can only occur in v1.8.2, at level 9 only, for "large enough" data blocks (> 64 KB) featuring a fairly specific data pattern. The pattern is improbable enough that multiple CPUs running various fuzzers non-stop for several weeks were not able to find it. Big thanks to @Pashugan for finding and sharing a reproducible sample.

Due to this fix, v1.8.3 is a recommended update.

A few other minor features were already merged, and are therefore bundled in this release too.

Should lz4 prove too slow, it's now possible to invoke the --fast=# command, by @jennifermliu. This is equivalent to the acceleration parameter in the API, in which the user forfeits some compression ratio in exchange for better speed.

The verbose CLI has been fixed, and now displays the real amount of time spent compressing (instead of cpu time). It also shows a new indicator, cpu load %, so that users can determine if the limiting factor was cpu or I/O bandwidth.

Finally, an existing function, LZ4_decompress_safe_partial(), has been enhanced to make it possible to decompress only the beginning of an LZ4 block, up to a specified number of bytes. Partial decoding can be useful to save CPU time and memory, when the objective is to extract a limited portion from a larger block.

@Cyan4973 Cyan4973 released this May 7, 2018

LZ4 v1.8.2 is a performance focused release, featuring important improvements for small inputs, especially when coupled with dictionary compression.

General speed improvements

LZ4 decompression speed has always been a strong point. In v1.8.2, this gets even better, as decompression speed improves by about 10%, thanks in large part to suggestions from @svpv.

For example, on a Mac OS-X laptop with an Intel Core i7-5557U CPU @ 3.10GHz,
running lz4 -bsilesia.tar compiled with default compiler llvm v9.1.0:

| Version             | v1.8.1    | v1.8.2    | Improvement |
|---------------------|-----------|-----------|-------------|
| Decompression speed | 2490 MB/s | 2770 MB/s | +11%        |

Compression speeds also receive a welcome boost, though the improvement is not evenly distributed, with higher levels benefiting quite a lot more.

| Version | v1.8.1    | v1.8.2    | Improvement |
|---------|-----------|-----------|-------------|
| lz4 -1  | 504 MB/s  | 516 MB/s  | +2%         |
| lz4 -9  | 23.2 MB/s | 25.6 MB/s | +10%        |
| lz4 -12 | 3.5 MB/s  | 9.5 MB/s  | +170%       |
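
The Improvement column is the relative change in speed between the two versions. A minimal sketch of that arithmetic follows; demo_improvement_percent() is a hypothetical helper written for this illustration, not an lz4 function.

```c
#include <assert.h>

/* Hypothetical helper (not part of lz4): relative speed change in percent.
 * A result of 100.0 means the new version is twice as fast as the old one. */
double demo_improvement_percent(double old_speed, double new_speed)
{
    assert(old_speed > 0.0);  /* a zero baseline has no meaningful ratio */
    return (new_speed - old_speed) / old_speed * 100.0;
}
```

With the level 12 figures, (9.5 - 3.5) / 3.5 comes to roughly 171%, which the table rounds to +170%.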

Should you aim for the best possible decompression speed, it's possible to request LZ4 to actively favor decompression speed, even if it means sacrificing some compression ratio in the process. This can be requested in a variety of ways depending on the interface, such as using the --favor-decSpeed command on the CLI. This option must be combined with ultra compression mode (levels 10+), as it needs careful weighting of multiple solutions, which only this mode can process.
The resulting compressed object always decompresses faster, but is also larger. Your mileage will vary, depending on file content. Speed improvement can be as low as 1%, and as high as 40%. It's matched by a corresponding file size increase, which tends to be proportional. The general expectation is 10-20% faster decompression speed for 1-2% bigger files.

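| Filename    | Decompression speed | --favor-decSpeed | Speed improvement | Size change |
|-------------|---------------------|------------------|-------------------|-------------|
| silesia.tar | 2870 MB/s           | 3070 MB/s        | +7 %              | +1.45%      |
| dickens     | 2390 MB/s           | 2450 MB/s        | +2 %              | +0.21%      |
| nci         | 3740 MB/s           | 4250 MB/s        | +13 %             | +1.93%      |
| osdb        | 3140 MB/s           | 4020 MB/s        | +28 %             | +4.04%      |
| xml         | 3770 MB/s           | 4380 MB/s        | +16 %             | +2.74%      |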

Finally, the LZ4_compress_destSize() variant also receives a ~10% speed boost, since it now internally redirects to the primary internal implementation of LZ4 fast mode, rather than relying on a separate custom implementation. This allows it to take advantage of all the optimization work that has gone into the main implementation.

Compressing small contents

When compressing small inputs, the fixed cost of clearing the compression's internal data structures can become a significant fraction of the compression cost. This release adds a new way, under certain conditions, to perform this initialization at effectively zero cost.

New, experimental LZ4 APIs have been introduced to take advantage of this functionality in block mode:

  • LZ4_resetStream_fast()
  • LZ4_compress_fast_extState_fastReset()
  • LZ4_resetStreamHC_fast()
  • LZ4_compress_HC_extStateHC_fastReset()

More detail about how and when to use these functions is provided in their respective headers.

LZ4 Frame mode has been modified to use this faster reset whenever possible. LZ4F_compressFrame_usingCDict() prototype has been modified to additionally take an LZ4F_CCtx* context, so it can use this speed-up.

Efficient Dictionary compression

Support for dictionaries has been improved in a similar way: they can now be used in-place, which avoids the expense of copying the context state from the dictionary into the working context. Users can expect to see a noticeable performance improvement for small data.

Experimental prototypes (LZ4_attach_dictionary() and LZ4_attach_HC_dictionary()) have been added to the LZ4 block API to use a loaded dictionary in-place. LZ4 Frame API users should benefit from this optimization transparently.

The previous two changes, when taken advantage of, can provide meaningful performance improvements when compressing small data. Both changes have no impact on the produced compressed data. The only observable difference is speed.

[Figure: Linux git compression ratio vs speed]

This graphic is representative of the sort of speed boost to expect. The red lines are the speeds seen for an input blob of the specified size, using the previous LZ4 release (v1.8.1) at compression levels 1 and 9 (fast mode and default HC level, respectively). The green lines are the equivalent observations for v1.8.2. This benchmark was performed on the Silesia Corpus. Results for the dickens text are shown; other texts and compression levels saw similar improvements. The benchmark was compiled with GCC 7.2.0 with -O3 -march=native -mtune=native -DNDEBUG under Linux 4.6 and run on an Intel Xeon CPU E5-2680 v4 @ 2.40GHz.

lz4frame_static.h Deprecation

The content of lz4frame_static.h has been folded into lz4frame.h, hidden by a macro guard "#ifdef LZ4F_STATIC_LINKING_ONLY". This means lz4frame.h now matches lz4.h and lz4hc.h. lz4frame_static.h is retained as a shell that simply sets the guard macro and includes lz4frame.h.
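
To illustrate how such a macro guard behaves in general, here is a self-contained sketch. Every name in it (DEMO_STATIC_LINKING_ONLY, demo_stable(), demo_experimental(), demo_caller()) is hypothetical and chosen for this example; it does not reproduce the actual contents of lz4frame.h.

```c
#include <assert.h>

/* The including file defines the guard before pulling in the header. */
#define DEMO_STATIC_LINKING_ONLY

/* ---- contents of a hypothetical demo.h ---- */
int demo_stable(void);            /* always declared */

#ifdef DEMO_STATIC_LINKING_ONLY
int demo_experimental(void);      /* declared only when the guard is set */
#endif
/* ---- end of demo.h ---- */

int demo_stable(void)       { return 1; }
int demo_experimental(void) { return 2; }

/* Uses the experimental entry point only if its declaration was enabled. */
int demo_caller(void)
{
    int total = demo_stable();
#ifdef DEMO_STATIC_LINKING_ONLY
    total += demo_experimental();
#endif
    assert(total >= 1);           /* the stable part is always present */
    return total;
}
```

By analogy, a file that wants the experimental LZ4 Frame declarations would define LZ4F_STATIC_LINKING_ONLY before including lz4frame.h, while code that omits the define sees only the stable interface.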

Changes list

This release also brings an assortment of small improvements and bug fixes, as detailed below :

  • perf: faster compression on small files, by @felixhandte
  • perf: improved decompression speed and binary size, by Alexey Tourbin (@svpv)
  • perf: faster HC compression, especially at max level
  • perf: very small compression ratio improvement
  • fix : compression compatible with low memory addresses (< 0xFFFF)
  • fix : decompression segfault when provided with NULL input, by @terrelln
  • cli : new command --favor-decSpeed
  • cli : benchmark mode more accurate for small inputs
  • fullbench : can bench _destSize() variants, by @felixhandte
  • doc : clarified block format parsing restrictions, by Alexey Tourbin (@svpv)

@Cyan4973 Cyan4973 released this Jan 14, 2018

LZ4 v1.8.1's most visible new feature is its support for dictionary compression.
This was already somewhat possible, but in a complex way, requiring knowledge of internal workings.
Support is now more formally added on the API side within lib/lz4frame_static.h. It's early days, and this new API is tagged "experimental" for the time being.

Support is also added in the command line utility lz4, using the new command -D, implemented by @felixhandte. The behavior of this command is identical to zstd, should you be already familiar.

lz4 doesn't specify how to build a dictionary. All it says is that it can be any file up to 64 KB.
This approach is compatible with zstd dictionary builder, which can be instructed to create a 64 KB dictionary with this command :

zstd --train dirSamples/* -o dictName --maxdict=64KB

LZ4 v1.8.1 also offers improved performance at ultra settings (levels 10+).
These levels receive a new code, called optimal parser, available in lib/lz4_opt.h.
Compared with the previous version, the new parser uses less memory (from 384KB to 256KB), performs faster, compresses a little bit better (not much, as it was already close to the theoretical limit), and resists pathological patterns which could destroy performance (see #339).

For comparison, here are some quick benchmarks using LZ4 v1.8.0 on my laptop with silesia.tar :

./lz4 -b9e12 -v ~/dev/bench/silesia.tar
*** LZ4 command line interface 64-bits v1.8.0, by Yann Collet ***
Benchmarking levels from 9 to 12
 9#silesia.tar       : 211984896 ->  77897777 (2.721),  24.2 MB/s ,2401.8 MB/s
10#silesia.tar       : 211984896 ->  77852187 (2.723),  16.9 MB/s ,2413.7 MB/s
11#silesia.tar       : 211984896 ->  77435086 (2.738),   7.1 MB/s ,2425.7 MB/s
12#silesia.tar       : 211984896 ->  77274453 (2.743),   3.3 MB/s ,2390.0 MB/s

and now using LZ4 v1.8.1 :

./lz4 -b9e12 -v ~/dev/bench/silesia.tar
*** LZ4 command line interface 64-bits v1.8.1, by Yann Collet ***
Benchmarking levels from 9 to 12
 9#silesia.tar       : 211984896 ->  77890594 (2.722),  24.4 MB/s ,2405.2 MB/s
10#silesia.tar       : 211984896 ->  77859538 (2.723),  19.3 MB/s ,2476.0 MB/s
11#silesia.tar       : 211984896 ->  77369725 (2.740),  10.1 MB/s ,2478.4 MB/s
12#silesia.tar       : 211984896 ->  77270146 (2.743),   3.7 MB/s ,2508.3 MB/s

The new parser is also directly compatible with lower compression levels, which brings additional benefits :

  • Compatibility with LZ4_*_destSize() variant, which reverses the logic by trying to fit as much data as possible into a predefined limited size buffer.
  • Compatibility with Dictionary compression, as it uses the same tables as regular HC mode

In the future, this compatibility will also allow dynamic on-the-fly change of compression level, but such feature is not implemented at this stage.

The release also provides a set of small bug fixes and improvements, listed below :

  • perf : faster and stronger ultra modes (levels 10+)
  • perf : slightly faster compression and decompression speed
  • perf : fix bad degenerative case, reported by @c-morgenstern
  • fix : decompression failed when using a combination of extDict + low memory address (#397), reported and fixed by Julian Scheid (@jscheid)
  • cli : support for dictionary compression (-D), by Felix Handte @felixhandte
  • cli : fix : lz4 -d --rm preserves timestamp (#441)
  • cli : fix : do not modify /dev/null permission as root, by @aliceatlas
  • api : new dictionary api in lib/lz4frame_static.h
  • api : _destSize() variant supported for all compression levels
  • build : make and make test compatible with parallel build -jX, reported by @mwgamera
  • build : can control LZ4LIB_VISIBILITY macro, by @mikir
  • install: fix man page directory (#387), reported by Stuart Cardall (@itoffshore)

Note : v1.8.1.2 is the same as v1.8.1, with the version number fixed in source code, as notified by Po-Chuan Hsieh (@sunpoet).

@Cyan4973 Cyan4973 released this Jan 13, 2018

Prefer using v1.8.1.2.
It's the same as v1.8.1, but the version number in source code has been fixed, thanks to @sunpoet.
The version number is used in cli and documentation display, to create the full name of dynamic library, and can be requested via LZ4_versionNumber().
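
For context, lz4.h builds the value returned by LZ4_versionNumber() as major*100*100 + minor*100 + release, so a library numbered 1.8.1 reports 10801. The sketch below shows how a caller might split such a number back into its parts; demo_version_t and demo_unpack_version() are hypothetical names introduced for this example and are not part of the LZ4 API.

```c
#include <assert.h>

/* Hypothetical helper type, not part of the LZ4 API. */
typedef struct {
    int major;
    int minor;
    int release;
} demo_version_t;

/* Splits a packed version number (major*10000 + minor*100 + release)
 * into its three components. */
demo_version_t demo_unpack_version(int packed)
{
    demo_version_t v;
    assert(packed >= 0);          /* packed version numbers are non-negative */
    v.major   = packed / 10000;
    v.minor   = (packed / 100) % 100;
    v.release = packed % 100;
    return v;
}
```

In a program linked against liblz4, the argument would typically be the result of LZ4_versionNumber(), which lets the caller check at run time which library version was actually loaded.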

@Cyan4973 Cyan4973 released this Aug 18, 2017

cli : fix : do not modify /dev/null permissions, reported by @Maokaman1
cli : added GNU separator -- specifying that all following arguments are only files
cli : restored -BX command enabling block checksum
API : added LZ4_compress_HC_destSize(), by @remittor
API : added LZ4F_resetDecompressionContext()
API : lz4frame : negative compression levels trigger fast acceleration, request by @llchan
API : lz4frame : can control block checksum and dictionary ID
API : fix : expose obsolete decoding functions, reported by @cyfdecyf
API : experimental : lz4frame_static.h : new dictionary compression API
build : fix : static lib installation, by @ido
build : dragonFlyBSD, OpenBSD, NetBSD supported
build : LZ4_MEMORY_USAGE can be modified at compile time, through external define
doc : Updated LZ4 Frame format to v1.6.0, restoring Dictionary-ID field in header
doc : lz4's API manual in .html format, by @inikep

@Cyan4973 Cyan4973 released this Jan 3, 2017

lz4hc : new high compression mode, by @inikep : levels 10-12 compress more (and slower), 12 is highest level
lz4cat : fix : works with relative path (#284) and stdin (#285) (reported by @beiDei8z)
cli : fix minor notification when using -r recursive mode
API : lz4frame : LZ4F_compressBound(0) provides upper bound of *flush() and *End() (#290, #280)
doc : markdown version of man page, by @t-mat (#279)
build : Makefile : fix make -jX lib+exe concurrency (#277)
build : cmake : improvements by @mgorny (#296)

Update : earlier versions of pre-compiled Windows binaries had a bug which made them unable to decode files > 2 GB. The new binaries available below fix this issue.

@Cyan4973 Cyan4973 released this Nov 22, 2016

fix : Makefile : release build compatible with PIE and customized compilation directives provided through environment variables (#274, reported by @totaam)

note : source code is unchanged, therefore library version is unchanged (v1.7.4)

@Cyan4973 Cyan4973 released this Nov 22, 2016

cli : fix : Large file support in 32-bits mode on Mac OS-X
compiler : fix : compilation on gcc 4.4 ( #272 ), reported by @totaam
Improved : much better speed in -mx32 mode

@Cyan4973 Cyan4973 released this Nov 16, 2016

Changed : new versioning scheme : package, cli and library now share the same version number
Improved: Small decompression speed boost
Improved: Small compression speed improvement on 64-bits systems
Improved: Small compression ratio and speed improvement on small files
Improved: Significant speed boost on ARMv6 and ARMv7
Fix : better ratio on 64-bits big-endian targets
Improved cmake build script, by @nemequ
New liblz4-dll project, by @inikep
Makefile: Generates object files (*.o) for faster (re)compilation on low power systems
cli : new : --rm and --help commands
cli : new : preserved file attributes, by @inikep
cli : fix : crash on some invalid inputs
cli : fix : -t correctly validates lz4-compressed files, by @terrelln
cli : fix : detects and reports fread() errors, thanks to @iyokan report #243
cli : bench : new : -r recursive mode
lz4cat : can cat multiple files in a single command line (#184)
Added : doc/lz4_manual.html, by @inikep
Added : dictionary compression and frame decompression examples, by @terrelln
Added : Debianization, by @bioothod

@Cyan4973 Cyan4973 released this Jun 30, 2015

New : Dos/DJGPP target, thanks to Louis Santillan (#114)
Added : Example using lz4frame library, by Zbigniew Jędrzejewski-Szmek (#118)
Changed: liblz4 : xxhash symbols are dynamically changed (namespace emulation) to avoid symbol conflict
Changed: liblz4.a (static library) no longer compiled with -fPIC by default