Commit
Merge pull request #23 from esc/blosc_as_subtree_for_merge
Blosc as subtree for merge
Showing 35 changed files with 2,138 additions and 10 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
bench/bench |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
=============================================================== | ||
Announcing Blosc 1.1.5 | ||
A blocking, shuffling and lossless compression library | ||
=============================================================== | ||
|
||
What is new? | ||
============ | ||
|
||
This is a maintenance release fixing an issue that prevented compilation | ||
with MSVC. | ||
|
||
For more info, please see the release notes in: | ||
|
||
https://github.com/FrancescAlted/blosc/wiki/Release-notes | ||
|
||
What is it? | ||
=========== | ||
|
||
Blosc (http://blosc.pytables.org) is a high performance compressor | ||
optimized for binary data. It has been designed to transmit data to | ||
the processor cache faster than the traditional, non-compressed, | ||
direct memory fetch approach via a memcpy() OS call. | ||
|
||
Blosc is the first compressor (that I'm aware of) that is meant not | ||
only to reduce the size of large datasets on-disk or in-memory, but | ||
also to accelerate object manipulations that are memory-bound. | ||
|
||
It also comes with a filter for HDF5 (http://www.hdfgroup.org/HDF5) so | ||
that you can easily implement support for Blosc in your favourite HDF5 | ||
tool. | ||
|
||
Download sources | ||
================ | ||
|
||
Please go to the main web site: | ||
|
||
http://blosc.pytables.org/sources/ | ||
|
||
or the github repository: | ||
|
||
https://github.com/FrancescAlted/blosc | ||
|
||
and download the most recent release from there. | ||
|
||
Blosc is distributed using the MIT license, see LICENSES/BLOSC.txt for | ||
details. | ||
|
||
Mailing list | ||
============ | ||
|
||
There is an official Blosc mailing list at: | ||
|
||
blosc@googlegroups.com | ||
http://groups.google.es/group/blosc | ||
|
||
|
||
---- | ||
|
||
**Enjoy data!** |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
Blosc - A blocking, shuffling and lossless compression library | ||
|
||
Copyright (C) 2009-2010 Francesc Alted (faltet@pytables.org) | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
FastLZ - lightning-fast lossless compression library | ||
|
||
Copyright (C) 2007 Ariya Hidayat (ariya@kde.org) | ||
Copyright (C) 2006 Ariya Hidayat (ariya@kde.org) | ||
Copyright (C) 2005 Ariya Hidayat (ariya@kde.org) | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
Copyright Notice and Statement for the h5py Project | ||
|
||
Copyright (c) 2008 Andrew Collette | ||
http://h5py.alfven.org | ||
All rights reserved. | ||
|
||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are | ||
met: | ||
|
||
a. Redistributions of source code must retain the above copyright | ||
notice, this list of conditions and the following disclaimer. | ||
|
||
b. Redistributions in binary form must reproduce the above copyright | ||
notice, this list of conditions and the following disclaimer in the | ||
documentation and/or other materials provided with the | ||
distribution. | ||
|
||
c. Neither the name of the author nor the names of contributors may | ||
be used to endorse or promote products derived from this software | ||
without specific prior written permission. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS | ||
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT | ||
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR | ||
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT | ||
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, | ||
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT | ||
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, | ||
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY | ||
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
Copyright (c) 2006-2008 Alexander Chemeris | ||
|
||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
|
||
1. Redistributions of source code must retain the above copyright notice, | ||
this list of conditions and the following disclaimer. | ||
|
||
2. Redistributions in binary form must reproduce the above copyright | ||
notice, this list of conditions and the following disclaimer in the | ||
documentation and/or other materials provided with the distribution. | ||
|
||
3. The name of the author may be used to endorse or promote products | ||
derived from this software without specific prior written permission. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED | ||
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF | ||
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO | ||
EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, | ||
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, | ||
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; | ||
OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, | ||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR | ||
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF | ||
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,163 @@ | ||
=============================================================== | ||
Blosc: A blocking, shuffling and lossless compression library | ||
=============================================================== | ||
|
||
:Author: Francesc Alted i Abad | ||
:Contact: faltet@pytables.org | ||
:URL: http://blosc.pytables.org | ||
|
||
What is it? | ||
=========== | ||
|
||
Blosc [1]_ is a high performance compressor optimized for binary data. | ||
It has been designed to transmit data to the processor cache faster | ||
than the traditional, non-compressed, direct memory fetch approach via | ||
a memcpy() OS call. Blosc is the first compressor (that I'm aware of) | ||
that is meant not only to reduce the size of large datasets on-disk or | ||
in-memory, but also to accelerate memory-bound computations. | ||
|
||
It uses the blocking technique (as described in [2]_) to reduce | ||
activity on the memory bus as much as possible. In short, this | ||
technique works by dividing datasets into blocks that are small enough | ||
to fit in the caches of modern processors and performing compression / | ||
decompression there. It also leverages, if available, SIMD | ||
instructions (SSE2) and multi-threading capabilities of CPUs, in order | ||
to accelerate the compression / decompression process to a maximum. | ||
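
The blocking idea described above can be sketched in a few lines of C. This is only an illustration and not Blosc's actual code: the names ``process_in_blocks``, ``block_fn`` and ``copy_block`` are hypothetical, the per-block work is a plain copy standing in for compression, and the block size is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: the work done on one cache-sized block. */
typedef void (*block_fn)(const unsigned char *src, unsigned char *dest,
                         size_t len);

/* Walk a buffer of nbytes in chunks of at most blocksize bytes, so that
 * each chunk can stay in cache while fn works on it.
 * Returns the number of blocks processed. */
size_t process_in_blocks(const unsigned char *src, unsigned char *dest,
                         size_t nbytes, size_t blocksize, block_fn fn)
{
    size_t nblocks = 0;
    size_t offset;

    assert(blocksize > 0);
    for (offset = 0; offset < nbytes; offset += blocksize) {
        size_t len = nbytes - offset;   /* bytes still to be processed */
        if (len > blocksize) {
            len = blocksize;            /* the last block may be shorter */
        }
        fn(src + offset, dest + offset, len);
        nblocks++;
    }
    return nblocks;
}

/* Stand-in for the real per-block work (an assumption for this sketch). */
void copy_block(const unsigned char *src, unsigned char *dest, size_t len)
{
    memcpy(dest, src, len);
}
```

Calling ``process_in_blocks(src, dest, 10000, 4096, copy_block)`` visits three blocks (4096, 4096 and 1808 bytes). A real implementation would pick the block size from the cache sizes of the target CPU and could hand different blocks to different threads.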
|
||
You can see some recent benchmarks about Blosc performance in [3]_. | ||
|
||
Blosc is distributed using the MIT license, see LICENSES/BLOSC.txt for | ||
details. | ||
|
||
.. [1] http://blosc.pytables.org | ||
.. [2] http://www.pytables.org/docs/CISE-12-2-ScientificPro.pdf | ||
.. [3] http://blosc.pytables.org/trac/wiki/SyntheticBenchmarks | ||
|
||
Meta-compression and other advantages over existing compressors | ||
=============================================================== | ||
|
||
Blosc is not like other compressors: it should rather be called a | ||
meta-compressor. This is so because it can use different compressors | ||
and pre-conditioners (programs that generally improve compression | ||
ratio). At any rate, it can also be called a compressor because it | ||
already integrates one compressor and one pre-conditioner, so it can | ||
work as one out of the box. | ||
|
||
Currently it uses BloscLZ, a compressor heavily based on FastLZ | ||
(http://fastlz.org/), and a highly optimized (it can use SSE2 | ||
instructions, if available) Shuffle pre-conditioner. However, | ||
different compressors or pre-conditioners may be added in the future. | ||
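
To make the idea of a shuffle pre-conditioner more concrete, here is a minimal, non-SIMD sketch in C. It is not the code in shuffle.c: the function names ``shuffle_bytes`` and ``unshuffle_bytes`` are hypothetical, and the handling of leftover bytes (copied unchanged) is an assumption made for this example. The point is only that grouping the first bytes of all elements together, then the second bytes, and so on, tends to produce long runs of similar bytes that a compressor can exploit.

```c
#include <assert.h>
#include <stddef.h>

/* Byte shuffle: for elements of typesize bytes, gather byte 0 of every
 * element, then byte 1 of every element, and so on. Bytes that do not
 * form a whole element are copied unchanged. */
void shuffle_bytes(size_t typesize, size_t nbytes,
                   const unsigned char *src, unsigned char *dest)
{
    size_t nelem, i, j;

    assert(typesize > 0);
    nelem = nbytes / typesize;
    for (j = 0; j < typesize; j++) {
        for (i = 0; i < nelem; i++) {
            dest[j * nelem + i] = src[i * typesize + j];
        }
    }
    for (i = nelem * typesize; i < nbytes; i++) {
        dest[i] = src[i];
    }
}

/* Inverse transformation: restore the original element layout. */
void unshuffle_bytes(size_t typesize, size_t nbytes,
                     const unsigned char *src, unsigned char *dest)
{
    size_t nelem, i, j;

    assert(typesize > 0);
    nelem = nbytes / typesize;
    for (j = 0; j < typesize; j++) {
        for (i = 0; i < nelem; i++) {
            dest[i * typesize + j] = src[j * nelem + i];
        }
    }
    for (i = nelem * typesize; i < nbytes; i++) {
        dest[i] = src[i];
    }
}
```

For example, the three little-endian 32-bit integers 1, 2 and 3 occupy the bytes ``01 00 00 00 02 00 00 00 03 00 00 00``. After shuffling with a type size of 4 they become ``01 02 03`` followed by nine zero bytes, which is much easier to compress.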
|
||
Blosc is in charge of coordinating the compressor and pre-conditioners | ||
so that they can leverage the blocking technique (described above) as | ||
well as multi-threaded execution (if several cores are available) | ||
automatically. As a result, every compressor and pre-conditioner | ||
can work at very high speeds, even if it was not initially designed | ||
for blocking or multi-threading. | ||
|
||
Other advantages of Blosc are: | ||
|
||
* Meant for binary data: can take advantage of the type size | ||
meta-information for improved compression ratio (using the | ||
integrated shuffle pre-conditioner). | ||
|
||
* Small overhead on non-compressible data: only a maximum of 16 | ||
additional bytes over the source buffer length are needed to | ||
compress *every* input. | ||
|
||
* Maximum destination length: contrary to many other | ||
compressors, both compression and decompression routines | ||
support a maximum length for the destination buffer. | ||
|
||
* Replacement for memcpy(): it supports a 0 compression level that | ||
does not compress at all and only adds 16 bytes of overhead. In | ||
this mode Blosc can copy memory usually faster than a plain | ||
memcpy(). | ||
|
||
When taken together, all these features set Blosc apart from other | ||
similar solutions. | ||
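
The overhead and destination-length points above suggest a simple way to size buffers. The following C sketch is hypothetical and does not use the Blosc API: ``HEADER_OVERHEAD``, ``max_dest_size`` and ``store_uncompressed`` are names invented here, and the 16-byte header is a zero-filled placeholder rather than Blosc's real header layout. It only illustrates the arithmetic (source length plus 16 bytes) and the habit of refusing to write past the destination size supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Taken from the text above: at most 16 extra bytes per buffer. */
#define HEADER_OVERHEAD 16

/* Worst-case destination size for a source buffer of nbytes. */
size_t max_dest_size(size_t nbytes)
{
    return nbytes + HEADER_OVERHEAD;
}

/* Hypothetical "compression level 0" store: write a placeholder header
 * followed by an unmodified copy of the data. Returns the number of
 * bytes written, or 0 if the destination buffer is too small. */
size_t store_uncompressed(const void *src, size_t nbytes,
                          void *dest, size_t destsize)
{
    unsigned char *out = (unsigned char *)dest;

    if (destsize < max_dest_size(nbytes)) {
        return 0;   /* never write past the caller-supplied limit */
    }
    memset(out, 0, HEADER_OVERHEAD);            /* placeholder header */
    memcpy(out + HEADER_OVERHEAD, src, nbytes); /* payload, unchanged */
    return nbytes + HEADER_OVERHEAD;
}
```

With this sketch, a 100-byte source needs a 116-byte destination; passing a 115-byte buffer makes ``store_uncompressed`` return 0 instead of overflowing. For the real routines and their exact return values, see blosc.h.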
|
||
Compiling your application with Blosc | ||
===================================== | ||
|
||
Blosc consists of the following files (in the blosc/ directory):: | ||
|
||
blosc.h and blosc.c -- the main routines | ||
blosclz.h and blosclz.c -- the actual compressor | ||
shuffle.h and shuffle.c -- the shuffle code | ||
|
||
Just add these files to your project in order to use Blosc. For | ||
information on compression and decompression routines, see blosc.h. | ||
|
||
To compile using GCC (4.4 or higher recommended) on Unix:: | ||
|
||
gcc -O3 -msse2 -o myprog myprog.c blosc/*.c -lpthread | ||
|
||
Using Windows and MINGW:: | ||
|
||
gcc -O3 -msse2 -o myprog myprog.c blosc\*.c | ||
|
||
Using Windows and MSVC (2008 or higher recommended):: | ||
|
||
cl /Ox /Femyprog.exe myprog.c blosc\*.c | ||
|
||
A simple usage example is the benchmark in the bench/bench.c file. | ||
Also, another example for using Blosc as a generic HDF5 filter is in | ||
the hdf5/ directory. | ||
|
||
I have not tried to compile this with compilers other than GCC, MINGW, | ||
Intel ICC or MSVC yet. Please report your experiences with your own | ||
platforms. | ||
|
||
Testing Blosc | ||
============= | ||
|
||
Go to the test/ directory and issue:: | ||
|
||
$ make test | ||
|
||
These tests are very basic, and only valid for platforms where GNU | ||
make/gcc tools are available. If you really want to test Blosc the | ||
hard way, look at: | ||
|
||
http://blosc.pytables.org/trac/wiki/SyntheticBenchmarks | ||
|
||
where instructions on how to intensively test (and benchmark) Blosc | ||
are given. If while running these tests you get some error, please | ||
report it back! | ||
|
||
Wrapper for Python | ||
================== | ||
|
||
Blosc has an official wrapper for Python. See: | ||
|
||
https://github.com/FrancescAlted/python-blosc | ||
|
||
Filter for HDF5 | ||
=============== | ||
|
||
For those that want to use Blosc as a filter in the HDF5 library, | ||
there is a sample implementation in the hdf5/ directory. | ||
|
||
Mailing list | ||
============ | ||
|
||
There is an official mailing list for Blosc at: | ||
|
||
blosc@googlegroups.com | ||
http://groups.google.es/group/blosc | ||
|
||
Acknowledgments | ||
=============== | ||
|
||
I'd like to thank the PyTables community, which has collaborated in the | ||
exhaustive testing of Blosc. With an aggregate amount of more than | ||
300 TB of different datasets compressed *and* decompressed | ||
successfully, I can say that Blosc is pretty safe now and ready for | ||
production purposes. Also, Valentin Haenel did terrific work fixing | ||
typos and improving the docs and the plotting script. | ||
|
||
|
||
---- | ||
|
||
**Enjoy data!** |