- Properly handle non-blocking I/O and partial writes for objects implementing io.RawIOBase.
- Consider making reads across frames configurable behavior.
- Overall API design review.
- Use Python allocator where possible.
- Figure out what to do about experimental APIs not implemented by CFFI.
- APIs for auto adjusting compression parameters based on input size. e.g. clamping the window log so it isn't too large for input.
- Consider allowing compressor and decompressor instances to be thread safe, support concurrent operations. Or track when an operation is in progress and refuse to let concurrent operations use the same instance.
- Support for magic-less frames for all decompression operations (decompress() doesn't work due to sniffing the content size and the lack of a ZSTD API to sniff magic-less frames - this should be fixed in 1.3.5.).
- Audit for complete flushing when ending compression streams.
- Deprecate legacy APIs.
- Audit for ability to control read/write sizes on all APIs.
- Detect memory leaks via bench.py.
- Remove low-level compression parameters from ZstdCompressor.__init__ and require use of ZstdCompressionParameters.
- Consider a chunker() API for decompression.
- Consider stats for chunker() API, including finding the last consumed offset of input data.
- Consider controls over resetting compression contexts (session only, parameters, or session and parameters).
- Utilize ZSTD_getDictID_fromCDict()?
- Stop relying on private libzstd headers and symbols (namely pool.h).
- Support for block compression APIs.
- API for ensuring max memory ceiling isn't exceeded.
- Move off nose for testing.
- We now use a non-rc version of cffi 1.17 on all Python versions. Python <=3.12 have cffi upgraded from cffi 1.16 -> 1.17.
- The pyproject.toml file now defines a [project] section.
- Bundled zstd library upgraded from 1.5.5 to 1.5.6.
- Releases now publish wheels for manylinux2014_ppc64le, manylinux2014_s390x, musllinux_1_2_aarch64, musllinux_1_2_i686, musllinux_1_2_ppc64le, musllinux_1_2_s390x, and musllinux_1_2_x86_64.
- PyO3 Rust crate upgraded from 0.18 to 0.21.
- Semi official support for CPython 3.13. Binary wheels for 3.13 are now published during releases. There were no meaningful code changes to support Python 3.13. Support is semi official since 3.13 is still in beta and 3.13 is currently being built against a pre-release version of cffi 1.17. We also lack a Rust extension for 3.13 since PyO3 lacks a release with 3.13 support.
- pyproject.toml now lists version constraints of [build-system] requirements, not exact versions. This should provide more compatibility with more environments. setuptools is held back before 69.0.0 because that version apparently broke support for using --global-settings=--build-option in editable installs, which our CI relies on.
- ZstdDecompressor.decompressobj() will change read_across_frames to default to True in a future release. If you depend on the current functionality of stopping at frame boundaries, start explicitly passing read_across_frames=False to preserve the current behavior.
- manylinux2010 wheels are no longer published since this wheel format is no longer supported by the pypa/manylinux project.
- Removed CI coverage for PyPy 3.7 and 3.8, which are no longer supported PyPy versions.
- Support for Python 3.7 has been dropped because it reached end of life. Python 3.8 is the minimum supported Python version. The code should still be compatible with Python 3.7, and removing the version checks from setup.py will likely yield a working install. However, this is not officially supported.
- ZstdDecompressor.decompressobj() now accepts a read_across_frames boolean named argument to control whether to transparently read across multiple zstd frames. It defaults to False to preserve existing behavior.
- Added CI coverage for PyPy 3.10.
- Added CI coverage for newer Anaconda Python versions.
- Packages used in CI have been upgraded to latest versions. This should nominally only impact developers of this project and not end-users.
- pyproject.toml now declares a [build-system] section saying to build with setuptools.
- CI now builds wheels with pip instead of setup.py directly.
- Official support for CPython 3.12. Binary wheels for 3.12 are now published during releases. There were no meaningful code changes to support Python 3.12.
- Binary wheels for musllinux_1_1 x86_64 and aarch64 are now being built and published.
- Support for Python 3.6 has been dropped. Python 3.7 is the minimum supported Python version.
- Bundled zstd library upgraded from 1.5.4 to 1.5.5.
- PyO3 Rust crate upgraded from 0.15 to 0.18.
- CI environment changed from Ubuntu 20.04 -> 22.04, Windows 2019 -> 2022, macOS 11 -> macOS 12.
- C types now use PyType_Spec and corresponding APIs. (#187) Contributed by Mike Hommey.
- This will likely be the last release officially supporting Python 3.6. Python 3.6 is end of life as of 2021-12-23.
- Bundled zstd library upgraded from 1.5.2 to 1.5.4.
- Use of the deprecated ZSTD_copyDCtx() was removed from the C and Rust backends.
- The C backend implementation of ZstdDecompressionObj.decompress() could have raised an assertion in cases where the function was called multiple times on an instance. In non-debug builds, calls to this method could have leaked memory.
- PyPy 3.6 support dropped; PyPy 3.8 and 3.9 support added.
- Anaconda 3.6 support dropped.
- Official support for Python 3.11. This did not require meaningful code changes and previous release(s) likely worked with 3.11 without any changes.
- CFFI's build system now respects distutils's compiler.preprocessor if it is set. (#179)
- The internal logic of ZstdDecompressionObj.decompress() was refactored. This may have fixed unconfirmed issues where unused_data was set prematurely. The new logic will also avoid an extra call to ZSTD_decompressStream() in some scenarios, possibly improving performance.
- ZstdDecompressor.decompress() now has a read_across_frames keyword argument. It defaults to False. True is not yet implemented and will raise an exception if used. The new argument will default to True in a future release and is provided now so callers can start passing read_across_frames=False to preserve the existing functionality during a future upgrade.
- ZstdDecompressor.decompress() now has an allow_extra_data keyword argument to control whether an exception is raised if input contains extra data. It defaults to True, preserving existing behavior of ignoring extra data. It will likely default to False in a future release. Callers desiring the current behavior are encouraged to explicitly pass allow_extra_data=True so behavior won't change during a future upgrade.
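
As a rough illustration of the advice in the two entries above, the helper below pins both keyword arguments in one place. The helper name decompress_pinned and the demo function are hypothetical; only the decompress() keyword names come from the notes, and the demo assumes the third-party zstandard package is installed.

```python
def decompress_pinned(dctx, data):
    """Call decompress() with today's defaults spelled out explicitly.

    dctx is expected to be a zstandard.ZstdDecompressor (or anything with a
    compatible decompress() method). Pinning the keywords means a future
    change of defaults would not silently alter behavior.
    """
    return dctx.decompress(
        data,
        read_across_frames=False,  # stop at the first frame boundary
        allow_extra_data=True,     # ignore bytes after the first frame
    )


def demo():
    """Not called here; requires the zstandard package to be installed."""
    import zstandard

    frame = zstandard.ZstdCompressor().compress(b"hello")
    return decompress_pinned(zstandard.ZstdDecompressor(), frame + b"trailing")
```

Keeping the call in a single helper leaves one place to revisit once the defaults actually change.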
- Bundled zstd library upgraded from 1.5.1 to 1.5.2.
- ZstdDecompressionObj now has an unused_data attribute. It will contain data beyond the fully decoded zstd frame data if said data exists.
- ZstdDecompressionObj now has an unconsumed_tail attribute. This attribute currently always returns the empty bytes value (b"").
- ZstdDecompressionObj now has an eof attribute returning whether the compressed data has been fully read.
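
The three attribute names above also exist on the standard library's zlib decompression objects. The sketch below uses zlib, not zstandard, to show what those attributes look like after decoding a stream followed by trailing bytes. Treating the zstandard attributes as analogous is an assumption made for illustration only.

```python
import zlib

payload = zlib.compress(b"hello world")
trailer = b"EXTRA"

dobj = zlib.decompressobj()
out = dobj.decompress(payload + trailer)

# eof becomes True once the end of the compressed stream has been reached.
stream_finished = dobj.eof
# unused_data holds the bytes that followed the compressed stream.
leftover = dobj.unused_data
# unconsumed_tail is only filled when a max_length limit is used; empty here.
tail = dobj.unconsumed_tail
```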
- ZstdCompressionWriter and ZstdDecompressionWriter now implement __iter__() and __next__(). The methods always raise io.UnsupportedOperation. The added methods are part of the io.IOBase abstract base class / interface and help ensure instances look like other I/O types. (#167, #168)
- The HASHLOG3_MAX constant has been removed since it is no longer defined in zstd 1.5.1.
- The ZstdCompressionReader, ZstdCompressionWriter, ZstdDecompressionReader, and ZstdDecompressionWriter types in the C backend now track their closed attribute using the proper C type. Before, due to a mismatch between the C struct type and the type declared to Python, Python could read the wrong bits on platforms like s390x and incorrectly report the value of the closed attribute to Python. (#105, #164)
- Bundled zstd library upgraded from 1.5.0 to 1.5.1.
- The C backend now exposes the symbols ZstdCompressionReader, ZstdCompressionWriter, ZstdDecompressionReader, and ZstdDecompressionWriter. This should match the behavior of the CFFI backend. (#165)
- ZstdCompressionWriter and ZstdDecompressionWriter now implement __iter__ and __next__, which always raise io.UnsupportedOperation.
- Documentation on thread safety has been updated to note that derived objects like ZstdCompressionWriter have the same thread unsafety as the contexts they were derived from. (#166)
- Support for Python 3.5 has been dropped. Python 3.6 is now the minimum required Python version.
- Bundled zstd library upgraded from 1.4.8 to 1.5.0.
- manylinux2014_aarch64 wheels are now being produced for CPython 3.6+. (#145)
- Wheels are now being produced for CPython 3.10.
- Arguments to ZstdCompressor() and ZstdDecompressor() are now all optional in the C backend and an explicit None value is accepted. Before, the C backend wouldn't accept an explicit None value (but the CFFI backend would). The new behavior should be consistent between the backends. (#153)
- ZstdCompressor.multi_compress_to_buffer() and ZstdDecompressor.multi_decompress_to_buffer() are no longer available when linking against a system zstd library. These experimental features are only available when building against the bundled single file zstd C source file distribution. (#106)
- setup.py now recognizes a ZSTD_EXTRA_COMPILER_ARGS environment variable to specify additional compiler arguments to use when compiling the C backend.
- PyPy build and test coverage has been added to CI.
- Added CI jobs for building against external zstd library.
- Wheels supporting macOS ARM/M1 devices are now being produced.
- References to Python 2 have been removed from the in-repo Debian packaging code.
- Significant work has been done on a Rust backend. It is currently feature complete but not yet optimized. We are not yet shipping the backend as part of the distributed wheels until it is more mature.
- The .pyi type annotations file has replaced various default argument values with "...".
- setup.py no longer attempts to build the C backend on PyPy. (#130)
- <sys/types.h> is now included before <sys/sysctl.h>. This was the case in releases prior to 0.15.0 and the include order was reversed as part of running clang-format. The old/working order has been restored. (#128)
- Include some private zstd C headers so we can build the C extension against a system library. The previous behavior of referencing these headers is restored. That behavior is rather questionable and undermines the desire to use the system zstd.
- Support for Python 2.7 has been dropped. Python 3.5 is now the minimum required Python version. (#109)
- train_dictionary() now uses the fastcover training mechanism (as opposed to cover). Some parameter values that worked with the old mechanism may not work with the new one. e.g. d must be 6 or 8 if it is defined.
- train_dictionary() now always calls ZDICT_optimizeTrainFromBuffer_fastCover() instead of different APIs depending on which arguments were passed.
- The names of various Python modules have been changed. The C extension is now built as zstandard.backend_c instead of zstd. The CFFI extension module is now built as zstandard._cffi instead of _zstd_cffi. The CFFI backend is now zstandard.backend_cffi instead of zstandard.cffi.
- ZstdDecompressionReader.seekable() now returns False instead of True because not all seek operations are supported and some Python code in the wild keys off this value to determine if seek() can be called for all scenarios.
- ZstdDecompressionReader.seek() now raises OSError instead of ValueError when the seek cannot be fulfilled. (#107)
- ZstdDecompressionReader.readline() and ZstdDecompressionReader.readlines() now accept an integer argument. This makes them conform with the IO interface. The methods still raise io.UnsupportedOperation.
- ZstdCompressionReader.__enter__ and ZstdDecompressionReader.__enter__ now raise ValueError if the instance was already closed.
- The deprecated overlap_size_log attribute on ZstdCompressionParameters instances has been removed. The overlap_log attribute should be used instead.
- The deprecated overlap_size_log argument to ZstdCompressionParameters has been removed. The overlap_log argument should be used instead.
- The deprecated ldm_hash_every_log attribute on ZstdCompressionParameters instances has been removed. The ldm_hash_rate_log attribute should be used instead.
- The deprecated ldm_hash_every_log argument to ZstdCompressionParameters has been removed. The ldm_hash_rate_log argument should be used instead.
- The deprecated CompressionParameters type alias to ZstdCompressionParameters has been removed. Use ZstdCompressionParameters.
- The deprecated aliases ZstdCompressor.read_from() and ZstdDecompressor.read_from() have been removed. Use the corresponding read_to_iter() methods instead.
- The deprecated aliases ZstdCompressor.write_to() and ZstdDecompressor.write_to() have been removed. Use the corresponding stream_writer() methods instead.
- ZstdCompressor.copy_stream(), ZstdCompressorIterator.__next__(), and ZstdDecompressor.copy_stream() now raise the original exception on error calling the source stream's read() instead of raising ZstdError. This only affects the C backend.
- ZstdDecompressionObj.flush() now returns bytes instead of None. This makes it behave more similarly to flush() methods for similar types in the Python standard library. (#78)
- ZstdCompressionWriter.__exit__() now always calls close(). Previously, close() would not be called if the context manager raised an exception. The old behavior was inconsistent with other stream types in this package and with the behavior of Python's standard library IO types. (#86)
- Distribution metadata no longer lists cffi as an install_requires except when running on PyPy. Instead, cffi is listed as an extras_require.
- ZstdCompressor.stream_reader() and ZstdDecompressor.stream_reader() now default to closing the source stream when the instance is itself closed. To change this behavior, pass closefd=False. (#76)
- The CFFI backend now defines ZstdCompressor.multi_compress_to_buffer() and ZstdDecompressor.multi_decompress_to_buffer(). However, they raise NotImplementedError, as they are not yet implemented.
- The CFFI backend now exposes the types ZstdCompressionChunker, ZstdCompressionObj, ZstdCompressionReader, ZstdCompressionWriter, ZstdDecompressionObj, ZstdDecompressionReader, and ZstdDecompressionWriter as symbols on the zstandard module.
- The CFFI backend now exposes the types BufferSegment, BufferSegments, BufferWithSegments, and BufferWithSegmentsCollection. However, they are not implemented.
- ZstdCompressionWriter.flush() now calls flush() on the inner stream if such a method exists. However, when close() itself calls self.flush(), flush() is not called on the inner stream.
- ZstdDecompressionWriter.close() no longer calls flush() on the inner stream. However, ZstdDecompressionWriter.flush() still calls flush() on the inner stream.
- ZstdCompressor.stream_writer() and ZstdDecompressor.stream_writer() now have their write_return_read argument default to True. This brings the behavior of write() in compliance with the io.RawIOBase interface by default. The argument may be removed in a future release.
- ZstdCompressionParameters no longer exposes a compression_strategy property. Its constructor no longer accepts a compression_strategy argument. Use the strategy property/argument instead.
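
The write_return_read entry above refers to the io.RawIOBase convention that write() reports how many bytes were consumed from the caller, not how many bytes reached the inner stream. A minimal sketch of that convention follows. It uses zlib from the standard library as a stand-in compressor, and the class CompressingWriter is hypothetical. It is not the library's implementation.

```python
import io
import zlib


class CompressingWriter(io.RawIOBase):
    """Hypothetical compressing writer built on zlib (not zstd)."""

    def __init__(self, inner):
        self._inner = inner
        self._cobj = zlib.compressobj()

    def writable(self):
        return True

    def write(self, data):
        self._inner.write(self._cobj.compress(data))
        # io.RawIOBase.write() returns the number of bytes taken from the
        # caller, regardless of how many compressed bytes were emitted.
        return len(data)

    def close(self):
        if not self.closed:
            # Emit whatever the compressor still buffers before closing.
            self._inner.write(self._cobj.flush())
        super().close()


dest = io.BytesIO()
writer = CompressingWriter(dest)
consumed = writer.write(b"a" * 1000)
writer.close()
```

Here consumed equals the input length even though far fewer compressed bytes were written to dest, which is the behavior generic stream wrappers rely on.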
- Fix a memory leak in stream_reader decompressor when reader is closed before reading everything. (Patch by Pierre Fersing.)
- The C backend now properly checks for errors after calling IO methods on inner streams in various methods. ZstdCompressionWriter.write() now catches exceptions when calling the inner stream's write(). ZstdCompressionWriter.flush() on inner stream's write(). ZstdCompressor.copy_stream() on dest stream's write(). ZstdDecompressionWriter.write() on inner stream's write(). ZstdDecompressor.copy_stream() on dest stream's write(). (#102)
- Bundled zstandard library upgraded from 1.4.5 to 1.4.8.
- The bundled zstandard library is now using the single C source file distribution. The 2 main header files are still present, as these are needed by CFFI to generate the CFFI bindings.
- PyBuffer instances are no longer checked to be C contiguous and have a single dimension. The former was redundant with what PyArg_ParseTuple() already did and the latter is not necessary in practice because very few extension modules create buffers with more than 1 dimension. (#124)
- Added Python typing stub file for the zstandard module. (#120)
- The make_cffi.py script should now respect the CC environment variable for locating the compiler. (#103)
- CI now properly uses the cffi backend when running all tests.
- train_dictionary() has been rewritten to use the fastcover APIs and to consistently call ZDICT_optimizeTrainFromBuffer_fastCover() instead of different C APIs depending on what arguments were passed. The function also now accepts arguments f, split_point, and accel, which are parameters unique to fastcover.
- CI now tests and builds wheels for Python 3.9.
- zstd.c file renamed to c-ext/backend_c.c.
- All built/installed Python modules are now in the zstandard package. Previously, there were modules in other packages. (#115)
- C source code is now automatically formatted with clang-format.
- ZstdCompressor.stream_writer(), ZstdCompressor.stream_reader(), ZstdDecompressor.stream_writer(), and ZstdDecompressor.stream_reader() now accept a closefd argument to control whether the underlying stream should be closed when the ZstdCompressionWriter, ZstdCompressionReader, ZstdDecompressionWriter, or ZstdDecompressionReader is closed. (#76)
- There is now a zstandard.open() function for returning a file object with zstd (de)compression. (#64)
- The zstandard module now exposes a backend_features attribute containing a set of strings denoting optional features present in that backend. This can be used to sniff feature support by performing a string lookup instead of sniffing for API presence or behavior.
- Python docstrings have been moved from the C backend to the CFFI backend. Sphinx docs have been updated to generate API documentation via the CFFI backend. Documentation for Python APIs is now fully defined via Python docstrings instead of spread across Sphinx ReST files and source code.
- ZstdCompressionParameters now exposes a strategy property.
- There are now compress() and decompress() convenience functions on the zstandard module. These are simply wrappers around the corresponding APIs on ZstdCompressor and ZstdDecompressor.
- Python 3.9 wheels are now provided.
- This will likely be the final version supporting Python 2.7. Future releases will likely only work on Python 3.5+. See #109 for more context.
- There is a significant possibility that future versions will use Rust - instead of C - for compiled code. See #110 for more context.
- Some internal fields of C structs are now explicitly initialized. (Possible fix for #105.)
- The make_cffi.py script used to build the CFFI bindings now calls distutils.sysconfig.customize_compiler() so compiler customizations (such as honoring the CC environment variable) are performed. Patch by @Arfrever. (#103)
- The make_cffi.py script now sets LC_ALL=C when invoking the preprocessor in an attempt to normalize output to ASCII. (#95)
- Bundled zstandard library upgraded from 1.4.4 to 1.4.5.
- setup.py is now executable.
- Python code reformatted with black using 80 character line lengths.
- pytest-xdist pytest extension is now installed so tests can be run in parallel.
- CI now builds manylinux2010 and manylinux2014 binary wheels instead of a mix of manylinux2010 and manylinux1.
- Official support for Python 3.8 has been added.
- Bundled zstandard library upgraded from 1.4.3 to 1.4.4.
- Python code has been reformatted with black.
- Support for Python 3.4 has been dropped since Python 3.4 is no longer a supported Python version upstream. (But it will likely continue to work until Python 2.7 support is dropped and we port to Python 3.5+ APIs.)
- Fix ZstdDecompressor.__init__ on 64-bit big-endian systems (#91).
- Fix memory leak in ZstdDecompressionReader.seek() (#82).
- CI transitioned to Azure Pipelines (from AppVeyor and Travis CI).
- Switched to pytest for running tests (from nose).
- Bundled zstandard library upgraded from 1.3.8 to 1.4.3.
- Fix memory leak in ZstdDecompressionReader.seek() (#82).
- ZstdDecompressor.read() now allows reading sizes of -1 or 0 and defaults to -1, per the documented behavior of io.RawIOBase.read(). Previously, we required an argument that was a positive value.
- The readline(), readlines(), __iter__, and __next__ methods of ZstdDecompressionReader() now raise io.UnsupportedOperation instead of NotImplementedError.
- ZstdDecompressor.stream_reader() now accepts a read_across_frames argument. The default value will likely be changed in a future release and consumers are advised to pass the argument to avoid unwanted change of behavior in the future.
- setup.py now always disables the CFFI backend if the installed CFFI package does not meet the minimum version requirements. Before, it was possible for the CFFI backend to be generated and a run-time error to occur.
- In the CFFI backend, CompressionReader and DecompressionReader were renamed to ZstdCompressionReader and ZstdDecompressionReader, respectively, so naming is identical to the C extension. This should have no meaningful end-user impact, as instances aren't meant to be constructed directly.
- ZstdDecompressor.stream_writer() now accepts a write_return_read argument to control whether write() returns the number of bytes read from the source / written to the decompressor. It defaults to off, which preserves the existing behavior of returning the number of bytes emitted from the decompressor. The default will change in a future release so behavior aligns with the specified behavior of io.RawIOBase.
- ZstdDecompressionWriter.__exit__ now calls self.close(). This will result in that stream plus the underlying stream being closed as well. If this behavior is not desirable, do not use instances as context managers.
- ZstdCompressor.stream_writer() now accepts a write_return_read argument to control whether write() returns the number of bytes read from the source / written to the compressor. It defaults to off, which preserves the existing behavior of returning the number of bytes emitted from the compressor. The default will change in a future release so behavior aligns with the specified behavior of io.RawIOBase.
- ZstdCompressionWriter.__exit__ now calls self.close(). This will result in that stream plus any underlying stream being closed as well. If this behavior is not desirable, do not use instances as context managers.
- ZstdDecompressionWriter no longer requires being used as a context manager (#57).
- ZstdCompressionWriter no longer requires being used as a context manager (#57).
- The overlap_size_log attribute on CompressionParameters instances has been deprecated and will be removed in a future release. The overlap_log attribute should be used instead.
- The overlap_size_log argument to CompressionParameters has been deprecated and will be removed in a future release. The overlap_log argument should be used instead.
- The ldm_hash_every_log attribute on CompressionParameters instances has been deprecated and will be removed in a future release. The ldm_hash_rate_log attribute should be used instead.
- The ldm_hash_every_log argument to CompressionParameters has been deprecated and will be removed in a future release. The ldm_hash_rate_log argument should be used instead.
- The compression_strategy argument to CompressionParameters has been deprecated and will be removed in a future release. The strategy argument should be used instead.
- The SEARCHLENGTH_MIN and SEARCHLENGTH_MAX constants are deprecated and will be removed in a future release. Use MINMATCH_MIN and MINMATCH_MAX instead.
- The zstd_cffi module has been renamed to zstandard.cffi. As had been documented in the README file since the 0.9.0 release, the module should not be imported directly at its new location. Instead, import zstandard to cause an appropriate backend module to be loaded automatically.
- CFFI backend could encounter a failure when sending an empty chunk into ZstdDecompressionObj.decompress(). The issue has been fixed.
- CFFI backend could encounter an error when calling ZstdDecompressionReader.read() if there was data remaining in an internal buffer. The issue has been fixed. (#71)
- ZstdDecompressionObj.decompress() now properly handles empty inputs in the CFFI backend.
- ZstdCompressionReader now implements read1() and readinto1(). These are part of the io.BufferedIOBase interface.
- ZstdCompressionReader has gained a readinto(b) method for reading compressed output into an existing buffer.
- ZstdCompressionReader.read() now defaults to size=-1 and accepts read sizes of -1 and 0. The new behavior aligns with the documented behavior of io.RawIOBase.
- ZstdCompressionReader now implements readall(). Previously, this method raised NotImplementedError.
- ZstdDecompressionReader now implements read1() and readinto1(). These are part of the io.BufferedIOBase interface.
- ZstdDecompressionReader.read() now defaults to size=-1 and accepts read sizes of -1 and 0. The new behavior aligns with the documented behavior of io.RawIOBase.
- ZstdDecompressionReader() now implements readall(). Previously, this method raised NotImplementedError.
- The readline(), readlines(), __iter__, and __next__ methods of ZstdDecompressionReader() now raise io.UnsupportedOperation instead of NotImplementedError. This reflects a decision to never implement text-based I/O on (de)compressors and keep the low-level API operating in the binary domain. (#13)
- README.rst now documents how to achieve linewise iteration using an io.TextIOWrapper with a ZstdDecompressionReader.
- ZstdDecompressionReader has gained a readinto(b) method for reading decompressed output into an existing buffer. This allows chaining to an io.TextIOWrapper on Python 3 without using an io.BufferedReader.
- ZstdDecompressor.stream_reader() now accepts a read_across_frames argument to control behavior when the input data has multiple zstd frames. When False (the default for backwards compatibility), a read() will stop when the end of a zstd frame is encountered. When True, read() can potentially return data spanning multiple zstd frames. The default will likely be changed to True in a future release.
- setup.py now performs CFFI version sniffing and disables the CFFI backend if CFFI is too old. Previously, we only used install_requires to enforce the CFFI version and not all build modes would properly enforce the minimum CFFI version. (#69)
- CFFI's ZstdDecompressionReader.read() now properly handles data remaining in any internal buffer. Before, repeated read() could result in random errors. (#71)
- Upgraded various Python packages in CI environment.
- Upgrade to hypothesis 4.5.11.
- In the CFFI backend, CompressionReader and DecompressionReader were renamed to ZstdCompressionReader and ZstdDecompressionReader, respectively.
- ZstdDecompressor.stream_writer() now accepts a write_return_read argument to control whether write() returns the number of bytes read from the source. It defaults to False to preserve backwards compatibility.
- ZstdDecompressor.stream_writer() now implements the io.RawIOBase interface and behaves as a proper stream object.
- ZstdCompressor.stream_writer() now accepts a write_return_read argument to control whether write() returns the number of bytes read from the source. It defaults to False to preserve backwards compatibility.
- ZstdCompressionWriter now implements the io.RawIOBase interface and behaves as a proper stream object. close() will now close the stream and the underlying stream (if possible). __exit__ will now call close(). Methods like writable() and fileno() are implemented.
- ZstdDecompressionWriter no longer must be used as a context manager.
- ZstdCompressionWriter no longer must be used as a context manager. When not using as a context manager, it is important to call flush(FLUSH_FRAME) or the compression stream won't be properly terminated and decoders may complain about malformed input.
- ZstdCompressionWriter.flush() (what is returned from ZstdCompressor.stream_writer()) now accepts an argument controlling the flush behavior. Its value can be one of the new constants FLUSH_BLOCK or FLUSH_FRAME.
- ZstdDecompressionObj instances now have a flush([length=None]) method. This provides parity with standard library equivalent types. (#65)
- CompressionParameters no longer redundantly stores individual compression parameters on each instance. Instead, compression parameters are stored inside the underlying ZSTD_CCtx_params instance. Attributes for obtaining parameters are now properties rather than instance variables.
- Exposed the STRATEGY_BTULTRA2 constant.
- CompressionParameters instances now expose an overlap_log attribute. This behaves identically to the overlap_size_log attribute.
- CompressionParameters() now accepts an overlap_log argument that behaves identically to the overlap_size_log argument. An error will be raised if both arguments are specified.
- CompressionParameters instances now expose an ldm_hash_rate_log attribute. This behaves identically to the ldm_hash_every_log attribute.
- CompressionParameters() now accepts an ldm_hash_rate_log argument that behaves identically to the ldm_hash_every_log argument. An error will be raised if both arguments are specified.
- CompressionParameters() now accepts a strategy argument that behaves identically to the compression_strategy argument. An error will be raised if both arguments are specified.
- The MINMATCH_MIN and MINMATCH_MAX constants were added. They are semantically equivalent to the old SEARCHLENGTH_MIN and SEARCHLENGTH_MAX constants.
- Bundled zstandard library upgraded from 1.3.7 to 1.3.8.
- setup.py denotes support for Python 3.7 (Python 3.7 was supported and tested in the 0.10 release).
- zstd_cffi module has been renamed to zstandard.cffi.
- ZstdCompressor.stream_writer() now reuses a buffer in order to avoid allocating a new buffer for every operation. This should result in faster performance in cases where write() or flush() are being called frequently. (#62)
- Bundled zstandard library upgraded from 1.3.6 to 1.3.7.
- zstd_cffi.py added to setup.py (#60).
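
The notes for this release mention linewise iteration by chaining an io.TextIOWrapper onto a ZstdDecompressionReader. The sketch below shows the general shape of that chaining. The helper iter_text_lines and the demo function are hypothetical names, the demo assumes the zstandard package is installed and is not exercised here, and the README remains the authoritative reference.

```python
import io


def iter_text_lines(binary_reader, encoding="utf-8"):
    """Yield decoded lines from any readable binary stream.

    The (de)compressor stays in the binary domain; text decoding and line
    splitting are delegated to io.TextIOWrapper.
    """
    text = io.TextIOWrapper(binary_reader, encoding=encoding)
    for line in text:
        yield line


def demo_with_zstandard(path):
    """Not called here; requires the zstandard package to be installed."""
    import zstandard

    with open(path, "rb") as fh:
        reader = zstandard.ZstdDecompressor().stream_reader(fh)
        return list(iter_text_lines(reader))


# Stand-in usage with an in-memory binary stream instead of a zstd reader.
lines = list(iter_text_lines(io.BytesIO(b"alpha\nbeta\n")))
```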
- Change some integer casts to avoid ssize_t (#61).
- ZstdCompressor.stream_reader().closed is now a property instead of a method (#58).
- ZstdDecompressor.stream_reader().closed is now a property instead of a method (#58).
- Stop attempting to package Python 3.6 for Miniconda. The latest version of Miniconda is using Python 3.7. The Python 3.6 Miniconda packages were a lie since these were built against Python 3.7.
- ZstdCompressor.stream_reader()'s and ZstdDecompressor.stream_reader()'s closed attribute is now a read-only property instead of a method. This now properly matches the IOBase API and allows instances to be used in more places that accept IOBase instances.
- ZstdDecompressor.stream_reader().read() now consistently requires an argument in both the C and CFFI backends. Before, the CFFI implementation would assume a default value of -1, which was later rejected.
- The compress_literals argument and attribute has been removed from zstd.ZstdCompressionParameters because it was removed by the zstd 1.3.5 API.
- ZSTD_CCtx_setParametersUsingCCtxParams() is no longer called on every operation performed against ZstdCompressor instances. The reason for this change is that the zstd 1.3.5 API no longer allows this without calling ZSTD_CCtx_resetParameters() first. But if we called ZSTD_CCtx_resetParameters() on every operation, we'd have to redo potentially expensive setup when using dictionaries. We now call ZSTD_CCtx_reset() on every operation and don't attempt to change compression parameters.
- Objects returned by ZstdCompressor.stream_reader() no longer need to be used as a context manager. The context manager interface still exists and its behavior is unchanged.
- Objects returned by ZstdDecompressor.stream_reader() no longer need to be used as a context manager. The context manager interface still exists and its behavior is unchanged.
- ZstdDecompressor.decompressobj().decompress() should now return all data from internal buffers in more scenarios. Before, it was possible for data to remain in internal buffers. This data would be emitted on a subsequent call to decompress(). The overall output stream would still be valid. But if callers were expecting input data to exactly map to output data (say the producer had used flush(COMPRESSOBJ_FLUSH_BLOCK) and was attempting to map input chunks to output chunks), then the previous behavior would be wrong. The new behavior is such that output from flush(COMPRESSOBJ_FLUSH_BLOCK) fed into decompressobj().decompress() should produce all available compressed input.
- ZstdDecompressor.stream_reader().read() should no longer segfault after a previous context manager resulted in error (#56).
- ZstdCompressor.compressobj().flush(COMPRESSOBJ_FLUSH_BLOCK) now returns all data necessary to flush a block. Before, it was possible for the flush() to not emit all data necessary to fully represent a block. This would mean decompressors wouldn't be able to decompress all data that had been fed into the compressor and flush()'ed. (#55).
- New module constants BLOCKSIZELOG_MAX, BLOCKSIZE_MAX, TARGETLENGTH_MAX that expose constants from libzstd.
- New ZstdCompressor.chunker() API for manually feeding data into a compressor and emitting chunks of a fixed size. Like compressobj(), the API doesn't impose restrictions on the input or output types for the data streams. Unlike compressobj(), it ensures output chunks are of a fixed size. This makes this API useful when the compressed output is being fed into an I/O layer, where uniform write sizes are useful.
- ZstdCompressor.stream_reader() no longer needs to be used as a context manager (#34).
- ZstdDecompressor.stream_reader() no longer needs to be used as a context manager (#34).
- Bundled zstandard library upgraded from 1.3.4 to 1.3.6.
- Added zstd_cffi.py and NEWS.rst to MANIFEST.in.
- zstandard.__version__ is now defined (#50).
- Upgrade pip, setuptools, wheel, and cibuildwheel packages to latest versions.
- Upgrade various packages used in CI to latest versions. Notably tox (in order to support Python 3.7).
- Use relative paths in setup.py to appease Python 3.7 (#51).
- Added CI for Python 3.7.
- Debian packaging support.
- Fix typo in setup.py (#44).
- Support building with mingw compiler (#46).
- CFFI 1.11 or newer is now required (previous requirement was 1.8).
- The primary module is now zstandard. Please change imports of zstd and zstd_cffi to import zstandard. See the README for more. Support for importing the old names will be dropped in the next release.
- ZstdCompressor.read_from() and ZstdDecompressor.read_from() have been renamed to read_to_iter(). read_from() is aliased to the new name and will be deleted in a future release.
- Support for Python 2.6 has been removed.
- Support for Python 3.3 has been removed.
- The selectivity argument to train_dictionary() has been removed, as the feature disappeared from zstd 1.3.
- Support for legacy dictionaries has been removed. Cover dictionaries are now the default. train_cover_dictionary() has effectively been renamed to train_dictionary().
- The allow_empty argument from ZstdCompressor.compress() has been deleted and the method now allows empty inputs to be compressed by default.
- estimate_compression_context_size() has been removed. Use CompressionParameters.estimated_compression_context_size() instead.
- get_compression_parameters() has been removed. Use CompressionParameters.from_level() instead.
- The arguments to CompressionParameters.__init__() have changed. If you were using positional arguments before, the positions now map to different arguments. It is recommended to use keyword arguments to construct CompressionParameters instances.
- TARGETLENGTH_MAX constant has been removed (it disappeared from zstandard 1.3.4).
- ZstdCompressor.write_to() and ZstdDecompressor.write_to() have been renamed to ZstdCompressor.stream_writer() and ZstdDecompressor.stream_writer(), respectively. The old names are still aliased, but will be removed in the next major release.
- Content sizes are written into frame headers by default (ZstdCompressor(write_content_size=True) is now the default).
- CompressionParameters has been renamed to ZstdCompressionParameters for consistency with other types. The old name is an alias and will be removed in the next major release.
- Fixed memory leak in ZstdCompressor.copy_stream() (#40) (from 0.8.2).
- Fixed memory leak in ZstdDecompressor.copy_stream() (#35) (from 0.8.2).
- Fixed memory leak of ZSTD_DDict instances in CFFI's ZstdDecompressor.
- Bundled zstandard library upgraded from 1.1.3 to 1.3.4. This delivers various bug fixes and performance improvements. It also gives us access to newer features.
- Support for negative compression levels.
- Support for long distance matching (facilitates compression ratios that approach LZMA).
- Support for reading empty zstandard frames (with an embedded content size of 0).
- Support for writing and partial support for reading zstandard frames without a magic header.
- New stream_reader() API that exposes the io.RawIOBase interface (allows you to .read() from a file-like object).
- Several minor features, bug fixes, and performance enhancements.
- Wheels for Linux and macOS are now provided with releases.
- Functions accepting bytes data now use the buffer protocol and can accept more types (like memoryview and bytearray) (#26).
- Add #includes so compilation on OS X and BSDs works (#20).
- New ZstdDecompressor.stream_reader() API to obtain a read-only i/o stream of decompressed data for a source.
- New ZstdCompressor.stream_reader() API to obtain a read-only i/o stream of compressed data for a source.
- Renamed ZstdDecompressor.read_from() to ZstdDecompressor.read_to_iter(). The old name is still available.
- Renamed ZstdCompressor.read_from() to ZstdCompressor.read_to_iter(). read_from() is still available at its old location.
- Introduce the zstandard module to import and re-export the C or CFFI backend as appropriate. Behavior can be controlled via the PYTHON_ZSTANDARD_IMPORT_POLICY environment variable. See README for usage info.
- Vendored version of zstd upgraded to 1.3.4.
- Added module constants CONTENTSIZE_UNKNOWN and CONTENTSIZE_ERROR.
- Add STRATEGY_BTULTRA compression strategy constant.
- Switch from deprecated ZSTD_getDecompressedSize() to ZSTD_getFrameContentSize() replacement.
- ZstdCompressor.compress() can now compress empty inputs without requiring special handling.
- ZstdCompressor and ZstdDecompressor now have a memory_size() method for determining the current memory utilization of the underlying zstd primitive.
- train_dictionary() has new arguments and functionality for trying multiple variations of COVER parameters and selecting the best one.
- Added module constants LDM_MINMATCH_MIN, LDM_MINMATCH_MAX, and LDM_BUCKETSIZELOG_MAX.
- Converted all consumers to zstandard's new advanced API, which uses ZSTD_compress_generic().
- CompressionParameters.__init__ now accepts several more arguments, including support for long distance matching.
- ZstdCompressionDict.__init__ now accepts a dict_type argument that controls how the dictionary should be interpreted. This can be used to force the use of content-only dictionaries or to require the presence of the dictionary magic header.
- ZstdCompressionDict.precompute_compress() can be used to precompute the compression dictionary so it can efficiently be used with multiple ZstdCompressor instances.
- Digested dictionaries are now stored in ZstdCompressionDict instances, created automatically on first use, and automatically reused by all ZstdDecompressor instances bound to that dictionary.
- All meaningful functions now accept keyword arguments.
- ZstdDecompressor.decompressobj() now accepts a write_size argument to control how much work to perform on every decompressor invocation.
- ZstdCompressor.write_to() now exposes a tell(), which exposes the total number of bytes written so far.
- ZstdDecompressor.stream_reader() now supports seek() when moving forward in the stream.
- Removed TARGETLENGTH_MAX constant.
- Added frame_header_size(data) function.
- Added frame_content_size(data) function.
- Consumers of ZSTD_decompress* have been switched to the new advanced decompression API.
- ZstdCompressor and ZstdCompressionParameters can now be constructed with negative compression levels.
- ZstdDecompressor now accepts a max_window_size argument to limit the amount of memory required for decompression operations.
- Added FORMAT_ZSTD1 and FORMAT_ZSTD1_MAGICLESS constants to be used with the format compression parameter to control whether the frame magic header is written.
- ZstdDecompressor now accepts a format argument to control the expected frame format.
- ZstdCompressor now has a frame_progression() method to return information about the current compression operation.
- Error messages in CFFI no longer have b'' literals.
- Compiler warnings and underlying overflow issues on 32-bit platforms have been fixed.
- Builds in CI now build with compiler warnings as errors. This should hopefully prevent new compiler warnings from being introduced.
- Make ZstdCompressor(write_content_size=True) and CompressionParameters(write_content_size=True) the default.
- CompressionParameters has been renamed to ZstdCompressionParameters.
- Fixed memory leak in ZstdCompressor.copy_stream() (#40).
- Fixed memory leak in ZstdDecompressor.copy_stream() (#35).
- Add #includes so compilation on OS X and BSDs works (#20).
- CompressionParameters now has an estimated_compression_context_size() method. zstd.estimate_compression_context_size() is now deprecated and slated for removal.
- Implemented a lot of fuzzing tests.
- CompressionParameters instances now perform extra validation by calling ZSTD_checkCParams() at construction time.
- multi_compress_to_buffer() API for compressing multiple inputs as a single operation, as efficiently as possible.
- ZSTD_CStream instances are now used across multiple operations on ZstdCompressor instances, resulting in much better performance for APIs that do streaming.
- ZSTD_DStream instances are now used across multiple operations on ZstdDecompressor instances, resulting in much better performance for APIs that do streaming.
- train_dictionary() now releases the GIL.
- Support for training dictionaries using the COVER algorithm.
- multi_decompress_to_buffer() API for decompressing multiple frames as a single operation, as efficiently as possible.
- Support for multi-threaded compression.
- Disable deprecation warnings when compiling CFFI module.
- Fixed memory leak in train_dictionary().
- Removed DictParameters type.
- train_dictionary() now accepts keyword arguments instead of a DictParameters instance to control dictionary generation.
- Added zstd.get_frame_parameters() to obtain info about a zstd frame.
- Added ZstdDecompressor.decompress_content_dict_chain() for efficient decompression of content-only dictionary chains.
- CFFI module fully implemented; all tests run against both C extension and CFFI implementation.
- Vendored version of zstd updated to 1.1.3.
- ZstdDecompressor.decompress() now uses ZSTD_createDDict_byReference() to avoid extra memory allocation of dict data.
- Add function names to error messages (by using ":name" in PyArg_Parse* functions).
- Reuse decompression context across operations. Previously, we created a new ZSTD_DCtx for each decompress(). This was measured to slow down decompression by 40-200MB/s. The API guarantees say ZstdDecompressor is not thread safe. So we reuse the ZSTD_DCtx across operations and make things faster in the process.
- ZstdCompressor.write_to()'s compress() and flush() methods now return number of bytes written.
- ZstdDecompressor.write_to()'s write() method now returns the number of bytes written to the underlying output object.
- CompressionParameters instances now expose their values as attributes.
- CompressionParameters instances are no longer subscriptable and no longer behave as tuples (backwards incompatible). Use attributes to obtain values.
- DictParameters instances now expose their values as attributes.
- Support for legacy zstd protocols (build time opt in feature).
- Automation improvements to test against Python 3.6, latest versions of Tox, more deterministic AppVeyor behavior.
- CFFI "parser" improved to use a compiler preprocessor instead of rewriting source code manually.
- Vendored version of zstd updated to 1.1.2.
- Documentation improvements.
- Introduce a bench.py script for performing (crude) benchmarks.
- ZSTD_CCtx instances are now reused across multiple compress() operations.
- ZstdCompressor.write_to() now has a flush() method.
- ZstdCompressor.compressobj()'s flush() method now accepts an argument to flush a block (as opposed to ending the stream).
- Disallow compress(b'') when writing content sizes by default (issue #11).
- More packaging fixes for source distribution.
- setup_zstd.py is included in the source distribution
- Vendored version of zstd updated to 1.1.1.
- Continuous integration for Python 3.6 and 3.7
- Continuous integration for Conda
- Added compression and decompression APIs providing similar interfaces to the standard library zlib and bz2 modules. This allows coding to a common interface.
- zstd.__version__ is now defined.
- read_from() on various APIs now accepts objects implementing the buffer protocol.
- read_from() has gained a skip_bytes argument. This allows callers to pass in an existing buffer with a header without having to create a slice or a new object.
- Implemented ZstdCompressionDict.as_bytes().
- Python's memory allocator is now used instead of malloc().
- Low-level zstd data structures are reused in more instances, cutting down on overhead for certain operations.
- distutils boilerplate for obtaining an Extension instance has now been refactored into a standalone setup_zstd.py file. This allows other projects with setup.py files to reuse the distutils code for this project without copying code.
- The monolithic zstd.c file has been split into a header file defining types and separate .c source files for the implementation.
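The "common interface" mentioned in the first item of the list above refers to the compress()/flush() shape shared by the standard library's compression objects. The sketch below uses only zlib and bz2 so that it runs without third-party packages. compress_chunks is a hypothetical helper written for this illustration; passing an object from ZstdCompressor.compressobj() to it is an assumption based on that note and is not exercised here.

```python
import bz2
import zlib


def compress_chunks(compressobj, chunks):
    """Feed chunks to any object exposing compress(data) and flush()."""
    out = [compressobj.compress(chunk) for chunk in chunks]
    out.append(compressobj.flush())
    return b"".join(out)


chunks = [b"alpha " * 100, b"beta " * 100, b"gamma " * 100]

# The same helper drives two different standard library compressors.
zlib_blob = compress_chunks(zlib.compressobj(), chunks)
bz2_blob = compress_chunks(bz2.BZ2Compressor(), chunks)
```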
2016-08-31 - Zstandard 1.0.0 is released and Gregory starts hacking on a Python extension for use by the Mercurial project. A very hacky prototype is sent to the mercurial-devel list for RFC.
2016-09-03 - Most functionality from Zstandard C API implemented. Source code published on https://github.com/indygreg/python-zstandard. Travis-CI automation configured. 0.0.1 release on PyPI.
2016-09-05 - After the API was rounded out a bit and support for Python 2.6 and 2.7 was added, version 0.1 was released to PyPI.
2016-09-05 - After the compressor and decompressor APIs were changed, 0.2 was released to PyPI.
2016-09-10 - 0.3 is released with a bunch of new features. ZstdCompressor now accepts arguments controlling frame parameters. The source size can now be declared when performing streaming compression. ZstdDecompressor.decompress() is implemented. Compression dictionaries are now cached when using the simple compression and decompression APIs. Memory size APIs added. ZstdCompressor.read_from() and ZstdDecompressor.read_from() have been implemented. This rounds out the major compression/decompression APIs planned by the author.
2016-10-02 - 0.3.3 is released with a bug fix for read_from not fully decoding a zstd frame (issue #2).
2016-10-02 - 0.4.0 is released with zstd 1.1.0, support for custom read and write buffer sizes, and a few bug fixes involving failure to read/write all data when buffer sizes were too small to hold remaining data.
2016-11-10 - 0.5.0 is released with zstd 1.1.1 and other enhancements.